UNIVERSITY 


Department of 
Mathematical Sciences 


Linear Optimisation and 
Numerical Analysis 
MX3503 


Ian Craw 





January 15, 2002, Version 4.2 


Copyright © 2002 by Ian Craw and the University of Aberdeen 


All rights reserved. 


Additional copies may be obtained from: 

Department of Mathematical Sciences 
University of Aberdeen 
Aberdeen AB9 2TY 


DSN: nrth202-103713-2 


Foreword 


What the Course Tries to Do 

It is always a good idea to start by saying what we are trying to do, and how you will 
recognise if you have succeeded! Modern education jargon sums this up by emphasising 
“aims” and “learning outcomes”. Here they are in the same form as in the “Catalogue of 
Courses”. It is hard to understand what the outcomes mean before you have finished the 
course; but at that stage, you should come back here and check that they do make sense. 
If they don’t, you may have missed something important. 

Aims 

The overall aim of the course is threefold: 

• to describe the simplex algorithm and show how it can be used to solve real problems; 

• to show how previous results in linear algebra give a framework for understanding 
the simplex algorithm; and 

• to place the simplex algorithm in a more general context by describing other calculus-based 
and computer-based optimisation algorithms. 

Learning Outcomes 

By the end of the course the student should be able to: 

• formulate mathematically, optimisation problems specified in words; 

• recognise a linear programming problem and, in simple cases, solve it using an 
appropriate form of the simplex algorithm; 

• justify the various steps in the algorithm (ignoring degenerate cases involving cycling) 
based on the prerequisite linear algebra; 

• solve more complicated linear programming problems using Maple; 

• state, prove and apply in simple cases, the fundamental theorem of duality; 

• solve simple matrix games, and describe how the solution derives from duality theory; 

• have an overview of nonlinear optimisation problems and be able to describe potential 
methods of solution, including simulated annealing and calculus based methods; 



• discuss genetic algorithms in the context of optimisation, describing choices to be 
made in the formulation; and 

• solve simple non-linear optimisation problems which are amenable to approaches using 
Lagrange multipliers or the Kuhn-Tucker conditions. 

The Course 

The simplex algorithm is one of the most successful applications of modern linear algebra. 
It provides a routine way of solving certain types of maximising or minimising problems 
which occur frequently in practice. Although it can be reduced to a (relatively) simple 
algorithm, the emphasis in this course is on describing the reasoning behind the algorithm, 
to show how it comes out of the linear algebra that was studied last year. We shall not 
consider directly how it may be programmed efficiently; in practice, if you ever need to use 
the simplex algorithm, you will have access to a good implementation. 

As far as possible, we explore the use of the algorithms using relatively simple examples. 
However, working manually, it is hard to go much beyond matrices with (say) 4 rows and 
at most 8 columns without losing the feel for the problem in a plethora of row operations. 
Thus for larger problems we use Maple, which easily handles larger matrices. We can thus 
illustrate more than a single feature of the general problem in a given example and give a 
more rounded or realistic set of examples. In practice we use a relatively simple subset of 
Maple’s facilities; Chapter 3 gives a review of all we need. As such this course should be 
accessible to students without previous experience with Maple. 

Although much of the course is devoted to the Simplex Algorithm and some of its 
applications, we conclude with a discussion of methods, both traditional and much more 
modern, which are suited to more general problems. This aims to put the detailed work in 
an appropriate applications context. 

Syllabus 

The following detailed course description was made available in a separate sheet, but it 
may be convenient to repeat it here. 

The course studies linear optimization methods in significant detail before moving on 
to more general non-linear problems, and will cover most of the following topics. 

Linear Optimization This occupies roughly three quarters of the course and consists of 

• Linear programming problems in practice. Examples including the transport 
problem. Revision of graphical methods of solution. 

• An introduction to, or revision of, the use of Maple to manipulate matrices, 
with particular emphasis on their construction and row-reduction. 

• The One Phase Algorithm. A standard form for linear programming problems. 
Solutions of (redundant) systems of linear equations, basic solutions and degenerate 
solutions; tableaus and change of basis. The basic solution of a linear 
programming problem associated with a tableau. The one-phase algorithm, and 
a partial proof of its correctness. Examples. 




• The Two Phase Algorithm. Introduction to the two phase problem. Artificial 
variables. Examples illustrating the various possibilities. A partial proof of the 
result (ignoring cycling). Linear programming problems without non-negativity 
restrictions. 

• Duality Theory. Primal and dual problems. The bidual. Statement and proof 
of the fundamental theorem. Examples, and illustrations of the failure of more 
general versions of duality. Interpretation of duality. 

• Game theory. An introduction to matrix games and conservative strategies. 
Pure and mixed strategies. Proof of the fundamental theorem using duality 
theory. Examples. 

Non-linear Optimization This work is treated more generally and will study topics from 

• General introduction to non-linear optimisation. Brief discussion of traditional 
methods such as steepest descent. Computer based searching including simulated 
annealing. 

• Genetic algorithms: an example of a modern method suitable for a wide variety of 
problems. A description of the method. Gray coding and its relevance. Methods 
of generating initial populations; selection mechanisms. Mutation. Examples of 
use in practice. 

• Quasi-linear optimisation methods: how to relax linearity slightly. Lagrange 
multipliers and the Kuhn-Tucker conditions. 

These Notes 

Printed notes are designed to help you get the most from the lectures and accompanying 
tutorials which form the main part of the course. They contain the material that I use 
when preparing the actual lectures; in that sense they are my lecture notes. They also 
approximate what you as a student may choose to write down from these lectures; in that 
sense they are your lecture notes. And in each case, they form an approximation: a lecture 
is a form of communication; I will alter things in the lecture whenever I think a change will 
improve communication, and you may choose to write down things from the lecture that I 
have not put in these notes. 

“Lectures were once useful, but now when all can read, and books are so numerous, 
lectures are unnecessary.” Samuel Johnson, 1799. 

Lecture notes have been around for centuries, either informally, as handwritten notes, 
or formally as textbooks. Recently improvements in typesetting have made it easier to 
produce “personalised” printed notes as here, but there has been no fundamental change. 
Experience shows that very few people are able to use lecture notes as a substitute for 
lectures; if it were otherwise, lecturing as a profession would have died out by now. To put 
it another way, “any teacher who can be replaced by a teaching machine, deserves to be”. 
So you should bear in mind that: 

• these notes are intended to complement the experience you get from attending the 
lectures; and 




• are available to supplement the notes that you take in the lectures. 

There is significant value in taking your own notes; you are much more likely to see what is 
going on as you do so. I hope that by having this version available as well, you can quickly 
correct any errors that occur as you take them. 

These notes have a long history; a course rather like this has been given within the 
Mathematics Department for at least 25 years. During that time many people have taught 
the course and all have left their mark on it; clarifying points that have proved difficult, 
selecting the “right” examples and so on. Dr Christopher Clapham was the first to formalise 
the course as a set of printed notes and this version was developed from them. It is now 
written in LaTeX, which allows a higher level view of the text, and simplifies the preparation 
of such things as the index on page 132 and numbered equations. You will find that most 
equations are not numbered, or are numbered symbolically. However sometimes I want 
to refer back to an equation, and in that case it is numbered within the chapter. Thus 
Equation (1.1) refers to the first numbered equation in Chapter 1 and so on. 

The Web Version 

Printed notes are convenient to use, but have the disadvantage that updating is expensive 
and inconvenient. In contrast the web is a very recent, but already almost universal medium, 
which offers convenient and very rapid updating. To take advantage of this, these notes are 
also available as a set of linked files on the web. Since I can update these files (over 750 when I 
last counted) easily, this format will always have the most recent version of the notes. They 
are available at http://www.maths.abdn.ac.uk/~igc/tch/mx3503/notes/notes.html. 

It is essential to use a graphics-based browser, because each piece of mathematics has 
to be delivered as a separate image. In the future browsers may be able to cope with 
mathematics, although I think this unlikely to happen quickly. The conversion from printed 
notes to HTML is automatic, and has some imperfections. However it is getting better; I 
am now reasonably confident that they say what I intended! 

A pdf (Portable Document Format) version is available from the Web. The file can then 
be viewed, typically in your browser, using Adobe’s freely available Acrobat reader. The 
whole document, or selected pages, can then be printed. This is likely to give much better 
quality output than printing directly from the web, since the pdf version is based on the 
original PostScript rather than the derived HTML. 

Much of this material lends itself to demonstration, so it is no surprise that there are 
many web resources available. In the HTML version of this document there are live links 
to some I have enjoyed. If you come across others, please let me know. And there seems 
no point in writing out the links in this printed version; they are of no use to you unless 
you have access to a browser and hence to the web version! 

The MX3503 Mailing List 

There is a mailing list associated with this class. You can subscribe to it by sending email 
to majordomo@maths.abdn.ac.uk with a message that has an empty “subject” field and 
contains the single line subscribe mx3503-list. If you add your signature automatically, 
you can also add the line end after the subscribe request, and the signature will be ignored. 
You are encouraged to join this list. You then mail mx3503-list@maths.abdn.ac.uk to 
contribute to the list. 



I have always been happy to deal with questions by email, and have made a point of 
publishing my email address both on the web and in the printed notes. This list provides 
a more general way of communicating in which both questions and their answers go to 
everyone on the list. Here are some reasons why this might be useful: 

• It is rare for just one person in a class to have a given problem; by sending the answer 
to this list, others can benefit from it. 

• When a topic causes general concern, the lectures can be changed to cover it in greater 
detail, or pick up the problem area. 

• Often other members of the class can be of more help than the lecturer; this way 
everyone is aware of the problems and is invited to help or comment. 

• I have always been careful to make statements about the examination in front of the 
whole class — the list provides an equivalent public forum. 

Please note that this list is being maintained on the mathematics machines rather than 
the central University ones, so you need to mail maths.abdn.ac.uk to reach it. 

Finally some points of netiquette. 

• Please remember the usual courtesies; although this is email, it is still fairly public. 

• If you send email directly to me, I will not copy it to this list without first getting 
your permission. 

• The list is low security. Although you are technically able to impersonate someone 
else, please don’t do so. 

• Keep a copy of the message you get when you join the list to remind you how to leave; 
you are welcome to leave and re-join as often as you wish. 

Sometimes it is useful to pass a message to the whole class. I believe that for most 
people, this list is a more useful way of doing so than posting it on a notice board. One natural use 
would be to cancel a lecture if that has to be done unexpectedly. The message may not 
reach everyone, but those who read it will be saved the inconvenience of turning up. 

Any more questions? Why not mail the list? You won’t be the only one with them. 

Computer Algebra Systems 

Part of this course involves the use of Maple to solve linear programming problems. I’m 
sure a knowledge of Maple will be valuable in your subsequent career; it is probably 
helpful to say a word about its relevance to this and other mathematics courses. Our aim 
in general is to teach principles and ideas. Examples are presented and questions set not 
because the answers themselves are interesting (although sometimes they will be) but 
with the aim of showing how the principles can be applied. So it is completely pointless 
to use a packaged routine which gives the answer, unless this helps you to understand the 
underlying method. You may well get answers using such a package in your subsequent 
professional career, but you should only do so provided you have a good understanding of 
the underlying principles. 


A computer algebra system is (essentially) infallible, but it can only answer the questions 
you ask. It is always useful to check you are asking the right questions by being able to do 
simple cases by hand. You can only do this if you have the sort of understanding we are 
trying to encourage in this course. You are welcome to use Maple to help with tutorial 
questions whenever you find it useful to do so; in other words, whenever you know exactly 
what you want to do, but are finding the manipulation heavy. 

Examinations will aim to test your understanding of the basic principles, rather than 
your ability to do long calculations accurately. Only very simple calculators are acceptable 
as electronic aids, and you should plan your use of Maple accordingly; it is a valuable 
tool, but you should be able to manage without it. 

This version of these notes was prepared in my study at home on a Maths Department 
Sun workstation. The examples included are accurate; I simply arrange for the output file 
from Maple to be processed automatically for inclusion in these notes. 

Books 

The recommended book for this course used to be Hartley (1985). It was cheap, well 
written and centred quite accurately around most of the material in the course. It is now 
out of print, but there are still copies in various libraries. An even older recommendation is 
Smythe & Johnson (1966), which has the advantage of being theoretically very thorough. 
A more modern book, clearly American but very nicely done is Winston (1995). It is not 
a formal recommendation because it covers much more than is needed, yet misses out a 
number of the modern methods that probably should be there. Another view is found in 
Press, Flannery, Teukolsky & Vetterling (1992) which is on the shelves of almost every 
practising engineer and industrial mathematician. Their view of the simplex algorithm, 
discussed briefly in Section 1.6, uses less linear algebra than we present here, and provides 
a nice contrast. 

A good modern view of the standard methods of looking at non-linear problems can be 
found in Kaplan (1999). It touches in part on linear optimisation, but its main interest is in 
the classical non-linear problems. A more standard book is (Kolman & Beck 1995); I would 
like to recommend this, but it is expensive for what is essentially a standard American 
college textbook. 

Press et al. (1992) also gives details about simulated annealing, one of the more modern 
methods we consider in the second part of the course. The other main one is the use 
of genetic algorithms, which is becoming increasingly popular, and my original reference, 
(Goldberg 1989), is starting to look dated. The Maple “help” system is good, but you may 
want to read more; the natural place to start is (Char, Geddes, Gonnet, Leong, Monagan 
& Watt 1992). Throughout, I have quoted the edition of the copy of the book that I use. 
In general the edition is not critical, although clearly a later edition is more in sympathy 
with modern ideas about teaching. 

Tutorials 

Trying to learn mathematics by reading is like learning to play football by watching videos; 
the real test is how well you can do it yourself! Tutorial sheets will be made available during 
the course; the exercises there are intended to help you come to grips with the material. 
Solutions will subsequently be distributed; my aim here is to ensure you spend some time 
thinking about problems rather than immediately giving up and looking at the suggested 
solutions. This is not however how you are likely to use the web, and so in the web version, 
exercises are immediately followed by their solution. 

There are some examples at the end of many of the Chapters, drawn from previous 
examination papers. You are strongly advised to try them yourself before looking at the 
(potential) solutions at the end of the notes. A number of past examination papers are 
available from the MX3503 homepage. You should make sure you have a proper attempt 
at each question before looking at the solution. You should be aware that official solutions 
are not made available. Many things can cause the solution to be wrong. One such is when 
the question is changed but the answer file is not updated. In the end you are responsible 
for ensuring that you understand the solutions and believe them to be correct. If you have 
problems in doing so, please seek help. 

Acknowledgements 

Over the years many students have helped to improve these notes by pointing out stupidities, 
repetitions, misprints and so on. Since they have gone on the web, others, often in 
the USA, have contributed to this gradual improvement by taking the trouble to let me 
know of difficulties, either in content or presentation. I thank all who have done so, and 
am happy to receive comments from others. 


Ian Craw 

Department of Mathematical Sciences 
Room 344, Meston Building 
email: Ian.Craw@maths.abdn.ac.uk 
www: http://www.maths.abdn.ac.uk/~igc 
January 15, 2002 




Contents 


Foreword 
Aims 
Learning Outcomes 
The Web Version 
Computer Algebra Systems 
Books 
Tutorials 
Acknowledgements 

1 Introducing the Simplex Algorithm 
1.1 Constrained Optimisation 
1.2 Some Sample Problems 
1.3 A More Elaborate Example 
1.4 Linear Programming Problems 
1.5 Standard form 
1.6 A Preview 
1.7 Questions 1 

2 The One-phase Simplex Algorithm 
2.1 Solutions of Systems of Linear Equations 
2.2 Basic Solutions 
2.3 Row-equivalent matrices 
2.4 Computation of basic solutions 
2.5 Change of basis 
2.6 Return to the Linear Programming Problem 
2.7 The One-phase Simplex Algorithm 
2.8 A partial proof 
2.9 Questions 2 

3 Using Maple 
3.1 Getting Access to Maple 
3.2 Using Maple to Check Results 
3.3 Pivoting Using Maple 
3.4 Cycling in Example 2.26 
3.5 An Extended Exercise 



4 The Two-Phase Simplex Algorithm 
4.1 Introduction 
4.2 The Second Phase Described 
4.3 Artificial Variables 
4.4 A Partial Proof 
4.5 Avoiding Non-Negativity 
4.6 An Alternative View 
4.7 Questions 3 

5 Duality 
5.1 Formulation of the Dual 
5.2 The Fundamental Theorem 
5.3 Shadow Prices 
5.4 The Dual Simplex Method 
5.5 Questions 4 

6 The Theory of Games 
6.1 Matrix Games 
6.2 Pure strategies 
6.3 Mixed Strategies 
6.4 The Fundamental Theorem 
6.5 Questions 5 

7 Non-linear Optimisation Problems 
7.1 Relaxing linearity conditions 
7.2 Maxima and Minima 
7.3 Lagrange’s Principle 
7.4 Inequality Constraints 
7.5 Convexity 
7.6 Questions 6 

8 Genetic Optimisation 
8.1 A Simple Algorithm 
8.2 Variations 
8.3 Further Discussion 
8.4 The Travelling Salesman Problem 
8.5 Example Problems 
8.6 Multiple Objective Functions 

9 Simulated Annealing 
9.1 Introduction 
9.2 An Example - the Travelling Salesman Problem 
9.3 Minimising a Function 
9.4 Comparison 
9.5 Questions 7 


Appendices 

A Solutions to Exercises 
Solutions 1 
Solutions 2 
Solutions 3 
Solutions 4 
Solutions 5 
Solutions 6 
Solutions 7 

References 

Index Entries 











List of Figures 


1.1 Two dimensional optimisation problem. 
2.1 Flow chart for one-phase simplex algorithm. 
4.1 Flow chart for two-phase simplex algorithm. 
7.1 A dimensioned box. 
8.1 Two strings with two crossover sites. 
8.2 Offspring: the result of the mating shown in Fig. 8.1. 
9.1 Locating a global maximum can be hard. 
9.2 An apparently smooth minimum may be like this? 
A.1 The feasible region is above the parabola and below the circle. 




List of Tables 


1.1 Machine shop costs. 
1.2 Productions times. 
1.3 Required muesli ingredients and selling prices. 
2.1 A tableau for the equation Ax = b. 
2.2 Basic Solution. 
2.3 More detail in a tableau for the equation Ax = b. 
2.4 Moving from one basic column to a new one. 
2.5 Three tableaus for Example 2.17. 
2.6 Example 2.20: three tableaus. 
2.7 Tableaus showing that Example 2.24 has no optimal solution. 
4.1 A two phase tableau. 
4.2 Two tableaus for Example 4.2. 
4.3 Example 4.3: pivoting about a bad row. 
4.4 Example 4.4 with no feasible solutions. 
4.5 Example 4.5: artificial variables. 
4.6 Four tableaus for Example 4.6. 
4.7 Pivoting on a bad row. 
5.1 Initial tableau for the primal Ax = b. 
5.2 Final tableau for the primal Ax = b. 
5.3 Duality example: the primal problem has no optimal solution. 
6.1 Initial tableau for Columnman’s problem. 
6.2 Final tableau for Columnman’s problem. 
6.3 Initial tableau for Example 6.14. 
6.4 Second tableau for Example 6.14. 
6.5 Final tableau for Example 6.14. 
6.6 Complete tableau for Example 6.15. 
6.7 One possible final tableau for Question 6.4. 
8.1 Binary codes. 
8.2 Gray codes. 
8.3 Initial Gene Pool. 
8.4 Mating Pool. 
8.5 After mating: this population forms the next generation, and has average fitness 439. 


8.6 A fitness function designed to test the propagation of “well positioned” blocks 
of 8 copies of 1. Only blocks whose positions reflect the underlying structure 
of the string are rewarded. 
8.7 Number of function evaluations needed to find the fittest string using different 
optimising algorithms. 
8.8 Payoff matrix for the Prisoner’s dilemma; the return to Rowman is given 
first, then the return to Columnman. 
9.1 Distances (or costs) between cities. 






Chapter 1 


Introducing the Simplex Algorithm 


1.1 Constrained Optimisation 

Almost the whole of this course is concerned with the following general question: 

Find the maximum or minimum value of a function $f = f(x_1, \dots, x_n)$ of $n$ real 
variables subject to the constraints that $g_i(x_1, \dots, x_n) \ge 0$ for $i = 1, \dots, m$. 

Such a problem will be referred to as a constrained optimisation problem, and the 
corresponding value as the optimum value. The function we are to optimise is called the 
objective function. The set of vectors $x = (x_1, \dots, x_n)$ which satisfy the constraints is 
known as the feasible region. A vector at which the objective function attains its optimal 
value is known as an optimal feasible vector. Of course there is no guarantee that an 
optimal feasible vector exists for a given problem. 

Note first that there is no need to deal with maximising and minimising separately, since 
finding a minimum of $f(x_1, \dots, x_n)$ is the same as finding a maximum of $-f(x_1, \dots, x_n)$. 
Note also that if we have two constraints $g_1 \ge 0$ and $g_2 \ge 0$ and in addition we know that 
$g_2 = -g_1$, then $g_1 = 0$; in other words an equality constraint can be reformulated as a pair 
of “$\ge$” constraints. 

As a simple example, consider the problem of maximising $x^2 - 5x + 6$ for $1 \le x \le 2$. Such 
a problem is familiar from first year calculus; it can be written as a constrained optimisation 
problem by writing $f(x) = x^2 - 5x + 6$ and defining constraints $g_1(x) = x - 1 \ge 0$ and 
$g_2(x) = 2 - x \ge 0$. 
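
The one-variable example can be checked directly. The following is only a minimal sketch, not part of the course material: the names `f`, `g1`, `g2` and `candidates` are chosen here for illustration, and the method is the first-year calculus one of comparing the end points of the interval with any stationary point that satisfies the constraints.

```python
def f(x):
    """Objective function of the example."""
    return x**2 - 5*x + 6

def g1(x):
    """First constraint, g1(x) >= 0 means x >= 1."""
    return x - 1

def g2(x):
    """Second constraint, g2(x) >= 0 means x <= 2."""
    return 2 - x

# f'(x) = 2x - 5 vanishes at x = 5/2; the end points are where a constraint is tight.
candidates = [1, 2, 2.5]
feasible = [x for x in candidates if g1(x) >= 0 and g2(x) >= 0]

best_x = max(feasible, key=f)
print(best_x, f(best_x))
```

The stationary point lies outside the feasible region, so the maximum is attained at an end point, where one of the constraints is tight.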

In the first part of the course, we concentrate on a special case of the problem in which 
both the objective function, and the constraints are linear functions. Such a problem is 
known as a Linear Programming Problem, or linear optimisation problem. 

1.2 Some Sample Problems 

We give a number of different examples of problems with linear constraints, aiming to show 
that a large class of “interesting” problems are of this type. 

1.2.1 Maximising a Function of Two Variables with Constraints 

Our first example can be solved quite simply by geometric methods. 




The problem is to maximise $x + y$ subject to the constraints that $2x + y \le 8$, $x + 2y \le 7$ 
and $x - y \ge -2$. This is illustrated in Fig. 1.1. The feasible region is shaded, and consists 
of the portion in the first quadrant (so $x \ge 0$ and $y \ge 0$) which lies below each of the three 
thick lines. These lines, $y = x + 2$, $y + 2x = 8$ and $2y + x = 7$, show where each 
constraint becomes tight. The objective function (the thing we are trying to maximise) is 
$x + y$, and the three parallel lines represent lines $x + y = k$ for different values of $k$, with 
increasing values of $k$ corresponding to lines which are further up and to the right. The 
largest value of $k$ for which the line still meets the feasible region is attained when the line 
just touches the vertex M. 



Figure 1.1: Two dimensional optimisation problem. 
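
The geometric argument can be imitated numerically. The sketch below is an illustration only, under the assumption that it is enough to examine the corners of the feasible region, as the picture suggests; the function name `vertices` and the representation of each constraint as a triple `(a, b, c)` meaning $ax + by \le c$ are choices made here, not notation from the notes.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is (a, b, c), meaning a*x + b*y <= c.
constraints = [
    (2, 1, 8),    # 2x + y <= 8
    (1, 2, 7),    # x + 2y <= 7
    (-1, 1, 2),   # x - y >= -2, rewritten as -x + y <= 2
    (-1, 0, 0),   # x >= 0
    (0, -1, 0),   # y >= 0
]

def vertices(cons):
    """Return the feasible points where two boundary lines cross."""
    found = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:              # parallel boundary lines never meet
            continue
        # Cramer's rule for the 2 x 2 system, in exact arithmetic.
        x = Fraction(c1 * b2 - c2 * b1, det)
        y = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x + b * y <= c for a, b, c in cons):
            found.add((x, y))
    return found

corner_points = vertices(constraints)
best = max(corner_points, key=lambda p: p[0] + p[1])
print(best, best[0] + best[1])
```

The point selected is $(3, 2)$, where the lines $2x + y = 8$ and $x + 2y = 7$ cross, giving $x + y = 5$.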


1.2.2 Machine Shop Scheduling 

Our next task is to show that problems of this sort can occur in “practical” situations; 
that problems of interest in the “real world” lead to this type of constrained optimisation 
problem. 

A machine shop makes two products called (rather unimaginatively) A and B. Product 
A can be made in two options, $A_1$ and $A_2$, while product B is available in options $B_1$, 
$B_2$ and $B_3$. The machine shop makes the two products using an appropriate combination 
of three machines, which can be used in any order. The production contract requires that 
60 units of item A and 85 units of item B be produced per week, although they can be 
produced in any of the various options. The objective of the exercise is to determine the 
product mix that is most profitable. The situation is summed up in Table 1.1. 

In order to write down in detail what is required, we need to introduce suitable variables. 

Choosing variables is often the hardest part of the whole process. One way is 
to think what you need to know in order to solve the problem — give the orders 
or instruct the foreman. Such variables are often known as decision variables 
because knowing their values enables a decision to be made. In this case, the 
decision is “how many of each option of each product do we make each week?” 

It is thus natural to introduce the following variables. Let $x_1$ be the number of units 
of product $A_1$ to be produced per week, $x_2$ be the number of units of product $A_2$ to be 
produced per week, $x_3$ be the number of units of product $B_1$ to be produced per week, $x_4$ 
be the number of units of product $B_2$ to be produced per week and $x_5$ be the number of 
units of product $B_3$ to be produced per week. 

                        Unit production time 
                        on machine number 
Product    Option       1       2       3       Unit Profit 
A          1            0.5     -       0.2     2 
A          2            -       0.4     0.2     2.5 
B          1            0.4     0.3     -       5 
B          2            0.4     -       0.3     4 
B          3            -       0.6     0.3     4 
Hours per week that 
machines are available  38      31      34 

Table 1.1: Machine shop costs. 

The profit from such a product mix is given by 

$$P = 2x_1 + 2.5x_2 + 5x_3 + 4x_4 + 4x_5,$$ 

and this is the function (of $x_1, x_2, \dots, x_5$) that we wish to maximise. The constraints are 
of three sorts: 

$$\begin{aligned}
x_1 + x_2 &= 60, \\
x_3 + x_4 + x_5 &= 85, && \text{(Required production)} \\
0.5x_1 + 0.4x_3 + 0.4x_4 &\le 38, \\
0.4x_2 + 0.3x_3 + 0.6x_5 &\le 31, \\
0.2x_1 + 0.2x_2 + 0.3x_4 + 0.3x_5 &\le 34, && \text{(Machine time)} \\
x_i &\ge 0 \quad \text{for each } i. && \text{(Reality)}
\end{aligned}$$ 

Solving this constrained optimisation problem then gives the values of $x_1, x_2, \dots, x_5$ which 
give the most profit for this particular contract. 

1.1. Remark. Much of the remainder of the course is devoted to solving such problems. 
When you can, and have enough facility with Maple, come back to this problem. You 
should find that the problem is feasible and that the maximum profit is 520 units. 
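
For readers who want to check the figure of 520 before meeting Maple, here is a brute-force sketch in Python. It is not the method developed in these notes. It assumes the standard result that a linear programming problem with a bounded, non-empty feasible region attains its optimum at a basic feasible solution, adds slack variables (called `s1`, `s2`, `s3` in the comments) to the three machine-time constraints, and then tries every choice of five basic columns. The helper `solve` and all variable names are hypothetical, introduced only for this illustration.

```python
from fractions import Fraction as F
from itertools import combinations

# Columns: x1..x5 (decision variables), then s1..s3 (slacks for machine time).
A = [
    [1, 1, 0, 0, 0, 0, 0, 0],                    # x1 + x2 = 60
    [0, 0, 1, 1, 1, 0, 0, 0],                    # x3 + x4 + x5 = 85
    ["0.5", 0, "0.4", "0.4", 0, 1, 0, 0],        # machine 1, 38 hours
    [0, "0.4", "0.3", 0, "0.6", 0, 1, 0],        # machine 2, 31 hours
    ["0.2", "0.2", 0, "0.3", "0.3", 0, 0, 1],    # machine 3, 34 hours
]
A = [[F(v) for v in row] for row in A]           # exact rational arithmetic
b = [F(60), F(85), F(38), F(31), F(34)]
profit = [F(2), F("2.5"), F(5), F(4), F(4), F(0), F(0), F(0)]

def solve(M, rhs):
    """Solve the square system M z = rhs exactly; return None if M is singular."""
    n = len(M)
    aug = [row[:] + [r] for row, r in zip(M, rhs)]
    for col in range(n):
        pivot = next((r for r in range(col, n) if aug[r][col] != 0), None)
        if pivot is None:
            return None
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col and aug[r][col] != 0:
                factor = aug[r][col]
                aug[r] = [v - factor * w for v, w in zip(aug[r], aug[col])]
    return [aug[r][n] for r in range(n)]

best_value, best_x = None, None
for basis in combinations(range(8), 5):          # choose 5 basic columns out of 8
    M = [[A[i][j] for j in basis] for i in range(5)]
    z = solve(M, b)
    if z is None or any(v < 0 for v in z):       # singular, or not feasible
        continue
    x = [F(0)] * 8
    for j, v in zip(basis, z):
        x[j] = v
    value = sum(c * v for c, v in zip(profit, x))
    if best_value is None or value > best_value:
        best_value, best_x = value, x

print(best_value, [str(v) for v in best_x[:5]])
```

Under these assumptions the search agrees with the remark: the best basic feasible solution has profit 520, attained at $x_1 = 8$, $x_2 = 52$, $x_3 = 34$, $x_4 = 51$, $x_5 = 0$. Enumeration of this kind is only practical for very small problems.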


1.2.3 A Transport Problem 

In the next introductory example, we give the data in purely symbolic form. As such this 
describes a class of problems, known as transport problems. 

A firm has warehouses $W_1, W_2, \dots, W_m$ to supply retail outlets $R_1, R_2, \dots, R_n$ with a 
certain product. The warehouse $W_i$ has a supply $s_i$ of the product ($i = 1, \dots, m$), measured 
in some convenient units, and we assume that all the supply is to be shipped to the retail 
outlets. In doing this, a demand $d_j$ at outlet $R_j$ must be satisfied for each $j = 1, \dots, n$. 
It is given that the cost of shipping the product from warehouse $W_i$ to retail outlet $R_j$ is 
proportional to the amount shipped, and that shipping a unit amount costs $c_{ij}$. 
















We formulate the problem of determining how much of the product should be sent from 
each warehouse to each retail outlet, so that all demands are satisfied, all the product is 
shipped, and the transportation cost is minimised. 

Again we have the problem of choosing variables which enable us to describe 
a solution. And again they are suggested by the information that you would need 
to pass to the manager of each warehouse. 

Let $x_{ij}$ be the amount of product shipped from warehouse $W_i$ to retail outlet $R_j$. Then 
the total shipping cost is 

$$C = \sum_{i=1}^{m} \sum_{j=1}^{n} c_{ij} x_{ij}.$$ 

Our supply constraint, that we ship all the product from warehouse $W_i$, becomes 

$$\sum_{j=1}^{n} x_{ij} = s_i,$$ 

while the demand constraint, to meet the demand specified at retail outlet $R_j$, is 

$$\sum_{i=1}^{m} x_{ij} = d_j.$$ 

Note we have the feasibility constraints that for each pair $(i, j)$, $x_{ij} \ge 0$, since we must ship 
a non-negative amount of the product. 

The problem then becomes one of minimising the cost C subject to these constraints. 
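
To make the symbols concrete, here is a small sketch with invented data: two warehouses and three outlets. The supplies, demands, unit costs and the shipping plan are all hypothetical, as are the function names `shipping_cost` and `is_feasible`. The code only evaluates the cost $C$ and checks the constraints for a given plan; it makes no attempt to find the cheapest one.

```python
def shipping_cost(cost, plan):
    """Total cost C: the sum over all i, j of cost[i][j] * plan[i][j]."""
    return sum(c * x
               for cost_row, plan_row in zip(cost, plan)
               for c, x in zip(cost_row, plan_row))

def is_feasible(plan, supply, demand):
    """Check the supply, demand and non-negativity constraints for a plan."""
    supply_ok = all(sum(row) == s for row, s in zip(plan, supply))
    demand_ok = all(sum(col) == d for col, d in zip(zip(*plan), demand))
    nonnegative = all(x >= 0 for row in plan for x in row)
    return supply_ok and demand_ok and nonnegative

# Hypothetical data: row i is warehouse W_i, column j is outlet R_j.
supply = [20, 30]
demand = [10, 25, 15]
cost = [[4, 6, 8],
        [5, 3, 7]]
plan = [[10, 0, 10],
        [0, 25, 5]]

print(is_feasible(plan, supply, demand), shipping_cost(cost, plan))
```

Note that the invented supplies and demands both total 50; if every warehouse ships all of its supply and every demand is met exactly, the two totals must agree.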

1.2.4 A Blending Problem 

Wine from three European countries is to be blended. We express all costs in £ (perhaps it
should be Euros?), so the three wines cost respectively $c_1$, $c_2$ and $c_3$ per litre. The wines
are to be blended and sold for $d$ per litre. The wines have acidities $a_1$, $a_2$ and $a_3$, and the
blended wine must have an acidity $\le a$. We assume that acidity blends by volume, so that
if the three wines are mixed in the proportions $x_1 : x_2 : x_3$, then, by volume, the mixture
has acidity $a_1 x_1 + a_2 x_2 + a_3 x_3$.

1.2. Remark. A separate question is whether the assumption is realistic. You could treat 
this as an excuse to do some practical work if you wish. Even if the assumption is false, 
there is a great temptation to behave as if it is true; because anything else is very much 
harder to manage. It should come as no surprise to you that most “scientific” assessment 
procedures behave as though their problem is linear even when it clearly isn’t: do you think 
that all the marks on a given exam question are equally easy to get? Note that our system 
invariably assumes that this is the case. 

To continue with the problem, the cost per litre of wine which is sold at $d$ is $c_1 x_1 +
c_2 x_2 + c_3 x_3$. Our problem is to maximise the profit, which is thus

\[ d - \sum_{i=1}^{3} c_i x_i, \]

subject to the constraints that $x_i \ge 0$ and $a_1 x_1 + a_2 x_2 + a_3 x_3 \le a$.




Another version This is the same problem, but there are only $Q_i$ litres of each wine
available. In this case, we use as variables $q_i$, the number of litres of each wine to be
blended. The problem then becomes that of maximising

\[ d \sum_{i=1}^{3} q_i - \sum_{i=1}^{3} c_i q_i. \]

The limited volume gives the constraint that $q_i \le Q_i$ for $i = 1, 2, 3$, while the acidity
constraint is

\[ a_1 \frac{q_1}{\sum_{i=1}^{3} q_i} + a_2 \frac{q_2}{\sum_{i=1}^{3} q_i} + a_3 \frac{q_3}{\sum_{i=1}^{3} q_i} \le a. \]

As it stands, this last constraint is not linear, but can be made so by multiplying through
by $\sum_{i=1}^{3} q_i$.
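
The multiplication can be written out as follows. This is only a sketch of the step just described; it assumes that some wine is actually blended, so that the total volume $q_1 + q_2 + q_3$ is strictly positive and the direction of the inequality is preserved.

```latex
\[
  a_1 q_1 + a_2 q_2 + a_3 q_3 \;\le\; a\,(q_1 + q_2 + q_3)
  \quad\Longleftrightarrow\quad
  \sum_{i=1}^{3} (a_i - a)\, q_i \;\le\; 0 .
\]
```

In the second form the constraint is a single linear inequality in the variables $q_1$, $q_2$, $q_3$.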


1.3 A More Elaborate Example 

In the past, the following example has been set as continuous assessment for this class. It is 
presented here as a relatively realistic example of how the simplex algorithm might be used 
in practice. You are invited to work the various parts of the example at the appropriate 
time during the course. At present you can do the “formulation” part. Here then is the 
“story”. 

You have been engaged by a manufacturing company because they value your expertise 
in Linear Programming. This may be a little premature, but they don’t need your report 
until after you have finished Section 3, when you will have the necessary expertise. The 
company is the Mendip Metals Manufacturing PLC. This example is borrowed, so I have 
deliberately left in the original location! And the Muchals Metals Manufacturing Company 
didn’t have quite the same foreign ring to it! The details provided by MMM are given in 
question 1.3 on tutorial sheet 1. 

Your eventual aim is to produce a report addressed to the MMM management, advising 
how much of each of the raw materials to buy, in order to maximise profits. You should also 
discuss the effects of changes in market conditions. You may in addition offer other advice 
based on your calculations. At this stage, you should show that the problem of deciding 
which, if any, of the alloys to manufacture, and from which raw materials, can be expressed 
as a Linear Programming Problem. Overall, the production of the report is divided into 
three parts: 

• formulating the problem; 

• obtaining a solution of the problem in Maple; and 

• obtaining a useful collection of solutions, and on the basis of your results, writing the 
report. 

There should be no mathematics in the body of the report. The mathematical formu¬ 
lation, and a brief statement of the solution, should go into Appendices, together with any 
sensitivity results you have calculated. Do not go into details of the simplex calculation, 
which should be done using Maple, and not by hand. The entire report will be assessed
by the Board (although not by me!) on the quality and clarity of your recommendations
as well as the accuracy of the calculations. I would welcome seeing finished work, and if 
enough people get that far, will arrange a number of “competing” presentations to the class 
during the last lecture of the course. 

Some Advice Here is the advice I gave when the report was to be handed in as CA. You 
may still find it valuable if you are going to do this exercise “properly”. 

• Remember that problems often have a number of formulations, and that the first one
you think of may not be the simplest. You can expect to spend quite a lot of time 
setting this problem up. 

• The actual calculations should be done in Maple, but no intermediate calculations 
need be shown. 

• Your report should be clearly organised. Number the pages. Explain the general 
structure of your report. Make your recommendations stand out. And think about 
an executive summary. You may wish to address some of your remarks to specific 
divisions of the company. 

• Legible handwriting or typing is acceptable, but you may also prepare your report in 
a word processor. If you bind your report, make sure that the writing is still visible, 
and does not disappear into the binding. 

• Ensure that, as consultant, your name appears on the report. It is possible that the 
board has called for more than one report on the same subject, and you would not 
wish your work to be confused with that of others. 


1.4 Linear Programming Problems 

With this introduction, we now give a fairly formal definition of the class of problem we 
are going to study. 

1.3. Definition. Suppose that one is given a linear (strictly an affine) function of $n$ real
variables

\[ z = f(x_1, x_2, \ldots, x_n) = c_1 x_1 + c_2 x_2 + \cdots + c_n x_n + d \]

and a set of linear inequalities and/or equations, called constraints

\[
\begin{aligned}
a_{11} x_1 + a_{12} x_2 + \cdots + a_{1n} x_n \;&\le \text{ or } = \text{ or } \ge\; b_1 \\
a_{21} x_1 + a_{22} x_2 + \cdots + a_{2n} x_n \;&\le \text{ or } = \text{ or } \ge\; b_2 \\
&\;\;\vdots \\
a_{m1} x_1 + a_{m2} x_2 + \cdots + a_{mn} x_n \;&\le \text{ or } = \text{ or } \ge\; b_m,
\end{aligned}
\tag{1.1}
\]

where in each line either $\le$, $=$ or $\ge$ occurs. The problem of finding $x$ in $\mathbb{R}^n$, where $x =
(x_1, x_2, \ldots, x_n)$, that satisfies the constraints (1.1) and makes $z$ a maximum (or minimum)
is called a Linear Programming Problem. We saw in Section 1.1 that, if it is convenient,
we can always restrict attention to just having “$\le$” signs, at the expense of having more
constraints.




We shall assume that every Linear Programming Problem has included in its constraints, 
the non-negativity restrictions 


\[ x_j \ge 0 \quad \text{for } j = 1, 2, \ldots, n, \tag{1.2} \]


and these will be written separately from the other constraints. We will see, in Section 4.5 
that this does not in fact limit the class of problems that can be handled by the meth¬ 
ods to be discussed. These non-negativity constraints are sometimes known as reality 
constraints. In examples they typically represent quantities that for physical reasons are 
non-negative. 

Any x satisfying the constraints (1.1) and inequalities (1.2) is called a feasible solution. 
The set of feasible solutions is the feasible region. Somewhat perversely, any x satisfying 
the constraints (1.1) but not (1.2) is called a non-feasible solution. 

The function $f$ is called the objective function and $z$ the objective variable. If
$x$ is a feasible solution that makes $f(x_1, \ldots, x_n)$ a maximum (or minimum) then $x$ is an
optimal solution and the corresponding value of $z$ is the optimal value.

Finally a convention: an element $x$ in $\mathbb{R}^n$ will be treated as a row vector or as a column
vector according to the context.

1.5 Standard form 

1.4. Definition. When written as: maximise (or minimise) $z = x_{n+1}$ subject to

\[ \sum_{j=1}^{n+1} a_{ij} x_j = b_i \quad \text{for } i = 1, 2, \ldots, m \quad \text{and} \quad x_j \ge 0 \quad \text{for } j = 1, 2, \ldots, n, \tag{1.3} \]

a Linear Programming Problem is said to be in standard form.

Using matrix notation, this standard form can be written: maximise (or minimise)
$x_{n+1}$ subject to $Ax = b$, and $x_i \ge 0$ for $i = 1, 2, \ldots, n$, where $x \in \mathbb{R}^{n+1}$ and $b \in \mathbb{R}^m$.
Notice that in the standard form the objective variable is equal to $x_{n+1}$ and this has no
non-negativity restriction; but we will still say that the Linear Programming Problem has
non-negativity restrictions. Moreover we write $x = (x_1, x_2, \ldots, x_n, x_{n+1})$ and if $x$ satisfies
all the conditions in (1.3) we say that $x$ is a feasible solution. It is not hard to see that
every Linear Programming Problem can be put into standard form in the following sense.
For any Linear Programming Problem $P$, there is a Linear Programming Problem $P'$ in
standard form, such that:

• a feasible solution of $P$ corresponds to a feasible solution of $P'$, both solutions giving
the same objective value, and conversely;

• any optimal solution of $P$ corresponds to an optimal solution of $P'$, both yielding the
same optimal value, and conversely; and

• if $P$ has no feasible solution then $P'$ has no feasible solution, and conversely.

The Linear Programming Problem $P'$ is called a related standard form for $P$. Rather
than prove this formally, we illustrate with an example.




1.5. Example. Express the problem of maximising $z = 5x_1 - x_2 + 2x_3 + 4$ subject to

\[
\begin{aligned}
x_1 - 3x_2 - x_3 &\le 8 \\
2x_1 - 2x_3 &\ge 5 \\
x_1 - 6x_2 + x_3 &= -7 \\
x_1, x_2, x_3 &\ge 0
\end{aligned}
\]

in standard form.

Let $P$ be the above Linear Programming Problem. Then a related standard form $P'$ for
$P$ can be obtained by introducing variables $x_4$ and $x_5$ in the two inequalities and writing
$z = x_6$, to obtain: maximise $x_6$ subject to

\[
\begin{aligned}
x_1 - 3x_2 - x_3 + x_4 &= 8 \\
2x_1 - 2x_3 - x_5 &= 5 \\
x_1 - 6x_2 + x_3 &= -7 \\
-5x_1 + x_2 - 2x_3 + x_6 &= 4 \\
x_1, x_2, x_3, x_4, x_5 &\ge 0.
\end{aligned}
\]

The variables, here $x_4$ and $x_5$, introduced into the inequalities to obtain a related standard
form, are called slack variables.
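
The correspondence between feasible solutions of P and of P' in this example can be checked numerically. The Python sketch below is an illustration only: the function names and the test point (5, 2, 0) are choices made here, not part of the notes. It maps a point of P to the six variables of P' and tests the four equations together with the non-negativity of the first five variables.

```python
def to_standard_form(x1, x2, x3):
    """Map a point of P to the variables (x1, ..., x6) of the standard form P'."""
    x4 = 8 - (x1 - 3 * x2 - x3)       # slack in the first inequality
    x5 = (2 * x1 - 2 * x3) - 5        # slack (surplus) in the second inequality
    x6 = 5 * x1 - x2 + 2 * x3 + 4     # objective variable z
    return [x1, x2, x3, x4, x5, x6]


# Coefficient matrix and right hand side of the four equations of P'.
A = [[1, -3, -1, 1, 0, 0],
     [2, 0, -2, 0, -1, 0],
     [1, -6, 1, 0, 0, 0],
     [-5, 1, -2, 0, 0, 1]]
b = [8, 5, -7, 4]


def satisfies_standard_form(x):
    """True if Ax = b and x1, ..., x5 >= 0 (x6 is unrestricted)."""
    equations = all(sum(a * v for a, v in zip(row, x)) == rhs
                    for row, rhs in zip(A, b))
    nonneg = all(v >= 0 for v in x[:5])
    return equations and nonneg


x = to_standard_form(5, 2, 0)    # (5, 2, 0) is feasible for P
print(x, satisfies_standard_form(x))
```

A point that violates a constraint of P, such as the origin, produces either a negative slack variable or a failed equation, so it is rejected by the same test.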

1.6 A Preview 

Our aim in the course is to use known linear algebra theory to put the simplex algorithm 
into a firm theoretical context. To this end we shall use the above transformation to related 
standard form to work with matrices as our fundamental object rather than inequalities 
or equations; such an abstraction has significant gains. However there is a perfectly valid 
description of the algorithm in terms of equations which we now describe. You may find it 
gives you an interesting alternative view of the whole process. 

1.6.1 An Example 

We illustrate the general case with an example, taken from Press et al. (1992, Section 10.8, 
Page 433). This reference provides a good description of the whole process to the stage 
of deriving an efficient algorithm for doing the computations. In my view the description 
gets very close to the title of their book — it is primarily designed as a recipe; but their 
explanation does proceed in terms of equations. The problem is: 

Maximise $z = 2x_1 - 4x_2$ subject to the reality constraints $x_1 \ge 0$, $x_2 \ge 0$,
$x_3 \ge 0$, $x_4 \ge 0$ and

\[
\begin{aligned}
x_3 &= 2 - 6x_1 + x_2, \\
x_4 &= 8 + 3x_1 - 4x_2.
\end{aligned}
\]


This problem is in a rather special form; we start by analysing it. Note that the variables in
the problem fall into two classes; those that appear on the right hand side of the equations,
namely $x_1$ and $x_2$, and the other variables ($z$, $x_3$ and $x_4$) which appear on the left hand
side of the equations. We shall, in an obvious way, call these the right hand variables and
the left hand variables respectively. So far there is nothing special. Note that each left
hand variable, including the objective function, $z$, appears in only one equation, and that
each equation involves a single left hand variable. Finally note that all the constants in the
constraints are positive, and that each left hand variable has coefficient $+1$. An objective
function and a set of “equality” constraints in this form is said to be in restricted normal
form.

Note that since each constraint has an associated left hand variable, there are at least 
as many variables as constraints — there are actually more variables than constraints if 
there are any right hand variables. Our experience solving linear systems suggests that this
is the “usual” situation in which we can expect a solution; had there been more constraints 
than variables we would have expected an “over-determined” system which is in general 
inconsistent. 

We are familiar with a system in this form when solving a set of equations using Gaussian 
elimination. Having diagonalised the left hand side of the system as much as possible, it is 
easy to solve it — simply move the remaining variables to the right hand side, give them 
arbitrary values, and solve for each of the left hand variables. In our case, we obtain a 
feasible solution to the optimisation problem by setting each right hand side variable to
0, and then solving for the left hand side variables. Since the constant in each constraint
equation is non-negative, the complete solution is necessarily feasible. In our case we have
the solution

\[ x_1 = 0, \; x_2 = 0; \quad x_3 = 2, \; x_4 = 8. \]

With these values, the objective function has the value z = 0. This wasn’t special; in the 
same way, we can guarantee to write down a feasible solution to any problem in restricted 
normal form. 

1.6.2 Improving the Solution 

Can we find a larger value of the objective function than 0? Since we had some freedom
in assigning the values of $x_1$ and $x_2$, can we do better?¹ Looking again at the objective
function, namely $z = 2x_1 - 4x_2$, it is clear there is no gain from trying to increase $x_2$, since
that will only decrease $z$. In contrast increasing $x_1$ may be useful. One way to do that is to
arrange for $x_1$ to be a left hand side variable. We do so by rearranging the first constraint:

\[ x_3 = 2 - 6x_1 + x_2 \quad \text{so} \quad x_1 = \tfrac{1}{3} - \tfrac{1}{6} x_3 + \tfrac{1}{6} x_2. \]

We can now return our system to restricted normal form by using this transformed equation
giving $x_1$ as a left hand variable to substitute for $x_1$ whenever it occurs as a right hand
variable. We have

\[
\begin{aligned}
z = 2x_1 - 4x_2 \quad &\text{so} \quad z = 2\left[\tfrac{1}{3} - \tfrac{1}{6} x_3 + \tfrac{1}{6} x_2\right] - 4x_2 = \tfrac{2}{3} - \tfrac{1}{3} x_3 - \tfrac{11}{3} x_2, \\
x_4 = 8 + 3x_1 - 4x_2 \quad &\text{so} \quad x_4 = 8 + 3\left[\tfrac{1}{3} - \tfrac{1}{6} x_3 + \tfrac{1}{6} x_2\right] - 4x_2 = 9 - \tfrac{1}{2} x_3 - \tfrac{7}{2} x_2.
\end{aligned}
\]

¹ We didn’t have complete freedom in assigning $x_1$ and $x_2$; each had to be non-negative, but we also had
to choose them so the left hand variables came out to be non-negative.





Our optimisation problem is now back in restricted normal form since it turns out that
each constraint equation has a non-negative constant. So we can obtain a feasible solution
as before; this time we have:

\[ x_3 = 0, \; x_2 = 0; \quad x_1 = \tfrac{1}{3}, \; x_4 = 9. \]

With these values, the objective function has the value $z = \tfrac{2}{3}$; the change of left hand
variables has resulted in an improved value for $z$.

We now have

\[ z = \tfrac{2}{3} - \tfrac{1}{3} x_3 - \tfrac{11}{3} x_2, \]

and since $x_3 \ge 0$ and $x_2 \ge 0$, necessarily $z \le \tfrac{2}{3}$. Clearly then we have found the maximum
value of $z$ and so finished the problem.
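
The swap of a left hand and a right hand variable carried out above is mechanical, and can be sketched in a few lines of Python. This is an illustration under assumptions made here: each equation is stored as a dictionary from variable names to coefficients, the key `"1"` holds the constant term, and the names `rows`, `swap` and `basic_solution` are hypothetical. Exact fractions are used so the results can be compared with the hand calculation.

```python
from fractions import Fraction as F

# Restricted normal form: left hand variable -> {"1": constant, right var: coeff}.
rows = {
    "z":  {"1": F(0), "x1": F(2),  "x2": F(-4)},
    "x3": {"1": F(2), "x1": F(-6), "x2": F(1)},
    "x4": {"1": F(8), "x1": F(3),  "x2": F(-4)},
}


def basic_solution(rows):
    """Set every right hand variable to 0 and read off the left hand ones."""
    sol = {left: expr["1"] for left, expr in rows.items()}
    for expr in rows.values():
        for var in expr:
            if var != "1":
                sol.setdefault(var, F(0))
    return sol


def swap(rows, leaving, entering):
    """Solve the `leaving` equation for `entering` and substitute elsewhere."""
    rows = {k: dict(v) for k, v in rows.items()}   # work on a copy
    expr = rows.pop(leaving)
    a = expr[entering]                             # must be non-zero
    # leaving = const + a*entering + rest  =>  entering = (leaving - const - rest)/a
    solved = {k: -v / a for k, v in expr.items() if k != entering}
    solved[leaving] = F(1) / a
    new_rows = {}
    for left, e in rows.items():
        coeff = e.get(entering, F(0))
        out = {k: v for k, v in e.items() if k != entering}
        for k, v in solved.items():
            out[k] = out.get(k, F(0)) + coeff * v
        new_rows[left] = out
    new_rows[entering] = solved
    return new_rows


new_rows = swap(rows, leaving="x3", entering="x1")
print(new_rows["z"])                # constant 2/3, x2 coefficient -11/3, x3 coefficient -1/3
print(basic_solution(new_rows))     # x1 = 1/3, x4 = 9, z = 2/3, x2 = x3 = 0
```

After the swap every coefficient in the row for z is negative, which is the stopping condition used in the argument above.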

This looks like a very special method, being applied to a very special version of the 
general Linear Programming Problem. However we saw in Section 1.5 that introducing 
slack variables can easily generate what we have just called left hand variables. Consider 
the problem of maximising $2x_1 - 4x_2$ subject to the constraints that

\[
\begin{aligned}
6x_1 - x_2 &\le 2, \\
-3x_1 + 4x_2 &\le 8.
\end{aligned}
\]

Introducing $x_3$ and $x_4$ as slack variables in the first and second inequalities respectively
gives the problem we have just discussed, and it is already in restricted normal form! Our
aim in what follows is to show that 

• all Linear Programming Problems can be reduced to restricted normal form; 

• if a maximum exists and hasn’t been found, the objective function can be improved 
by swapping right hand and left hand variables as above; and 

• if a maximum exists, it will be found after a finite number of steps. 

In fact we have no more use for the ideas of left and right hand variables, or of restricted 
normal form, except when discussing this formulation of the problem. 

1.6. Example. We used the first constraint to make $x_1$ a left hand variable. Show that had
we used the second constraint to make $x_1$ a left hand variable, thus making $x_4$ a right hand
variable instead of $x_3$, the corresponding solution would not have been feasible. Strictly,
the resulting system would not have been in restricted normal form.

1.7 Questions 1 (Hints and solutions start on page 107.) 

1.1. Q. A paper mill produces rolls of paper each of which is 18 ft. wide. These are then cut into 
various widths as required by customers. A roll can be cut into narrower rolls as many times as 
necessary. 

a) A customer requires rolls in widths of 9 ft., 7 ft. and 5 ft. In how many different ways can
the 18 ft. rolls be cut to yield (one or more of) these sizes? In each case, say how much waste is 
created. 




b) An order is received for 10 rolls 9 ft. wide, 20 rolls 7 ft. wide and 50 rolls 5 ft. wide. 
Formulate a linear programming problem to fill the order using the minimum number of rolls. Do 
not solve this problem, but say what difficulties you see in putting the solution into practice. 

c) Now suppose that surplus rolls less than 5 ft. wide are sold for £k per ft., so that a roll of 
width w < 5 is sold for £kw. Suppose also that a standard roll costs £P to produce, and that each 
cut costs £C. Formulate a linear programming problem to satisfy the requirements at minimum 
net cost. 

1.2. Q. A factory makes three types of small decorative garden sculpture, known as Bashful, Dozy 
and Happy. Their manufacture is done using three different machines, called A, B and C, which 
can be used in any order. The number of hours needed on each of the three machines to make each 
sculpture is given in the following table

              A    B    C    Profit/unit
    Bashful   2    1    2    £2
    Dozy      1    0    3    £4
    Happy     0    3    2    £3

which also gives the net profit made on each of the sculptures. Machine A is available for 43 hours
per week, machine B for 37 and machine C for 42 hours per week.

The problem is to decide how many of each type should be made each week in order to maximise
the profit on the operation. Formulate the problem as a linear programming problem. [You are not
asked to solve the problem.]

1.3. Q. A pharmaceutical company is creating a tablet for a new drug. Each tablet is to contain
a binder, a disintegrant and a filler in addition to the active drug ingredient, which is to be 14%
of the weight of each tablet. Chemical and physical considerations mean that the weight of the
disintegrant should not exceed 25% of the combined weights of the binder and the active ingredient,
and that there should be at most 10 times as much filler as binder. The disintegrant costs £15, the
binder £50 and filler £2 per kilogram.

The problem is to decide how to formulate the tablet in order to minimise its cost. Express the
problem as a linear programming problem. [You are not asked to solve the problem.]

1.4. Q. A manufacturer makes a range of three types of car, A, B and C in two factories; an engine
plant E and a body factory F. The nett profit on a car of type A is £1100, on type B it is £1200
and on type C it is £1450. There are 10120 labour units (a labour unit is approximately one “man
hour”) available in E and 11000 labour units in factory F each month and the number of labour
units needed to build the various products is given in Table 1.2.

    Car type   Engine   Body
    A          8        8
    B          8        9
    C          9        11

    Table 1.2: Production times

Formulate the problem of maximising the profit subject to these constraints as a linear
programming problem. [You are not asked to solve it.]

One purported solution of this problem shows no cars of type B are made. Is this plausible?
Describe briefly three ways in which this mathematical model of car production might be modified
to be more realistic.


1.5. Q. A Natural Food store makes up three types of muesli, which it sells as “Crunchy”, “Healthy”
and “Rich”. Each type is made by mixing different proportions of cereals, nuts and dried fruit.
Existing advertising material means that the restrictions and selling prices given in Table 1.3 are
fixed.

    Type      Cereal         Dried Fruit    Nuts           Selling price per kilo (£)
    Crunchy                                 At least 60%   1.60
    Healthy   At least 60%                  At most 20%    1.20
    Rich      At most 20%    At least 60%                  2.00

    Table 1.3: Required muesli ingredients and selling prices.

The suppliers can deliver at most 100 kilos of cereal at £0.80 per kilo, 80 kilos of dried fruit 
at £1.50 per kilo and 60 kilos of nuts at £1.00 per kilo each week. The store is able to sell all 
the muesli that it mixes. Formulate the problem of finding the most profitable mixing scheme as a 
linear programming problem. You are not asked to solve the problem. 






Chapter 2 


The One-phase Simplex Algorithm 


2.1 Solutions of Systems of Linear Equations 

From here until Section 2.6, we shall forget about the objective function and just consider 
the constraints. 

For a Linear Programming Problem in standard form, these are just a set of equations 
$Ax = b$, and so we are studying such systems of linear equations. The $m \times n$ inhomogeneous
system $Ax = b$ can be written as

\[ x_1 a_1 + x_2 a_2 + \cdots + x_n a_n = b \]

where $a_j$ is the $j$-th column of the matrix $A$. Thus any solution of the equations is essentially
an expression of b as a linear combination of the columns of A. 

2.1. Definition. The column space of $A$, denoted by $M_A$, is the subspace of $\mathbb{R}^m$ spanned
by the columns of $A$.

There are just three possibilities:

1. the column $b \notin M_A$ and so there are no solutions;

2. the column $b \in M_A$ and $a_1, a_2, \ldots, a_n$ are linearly independent, in which case there
is a unique solution $x$; or

3. the column $b \in M_A$ and $a_1, a_2, \ldots, a_n$ are linearly dependent, in which case there
are infinitely many solutions.

Clearly case (3) is the only case that is of interest for Linear Programming Problems, 
in contrast to our usual interest in uniqueness. Different solutions of the set of equations 
correspond to possibly different values of the objective function; and as such we can ask 
which of the solutions gives the best answer. 

2.2 Basic Solutions 

We assume that $b \in M_A$ and that $a_1, a_2, \ldots, a_n$ are linearly dependent.

Let $B = \{a_{h_1}, a_{h_2}, \ldots, a_{h_k}\}$, where $1 \le h_i \le n$ for $i = 1, 2, \ldots, k$. Suppose that $B$ is a
basis for $M_A$, so that $k = \dim M_A$.




2.2. Definition. The solution associated with the basis $B$ is the unique vector $x =
(x_1, x_2, \ldots, x_n)$ such that $b = \sum_{j=1}^{n} x_j a_j$ with $x_j = 0$ if $a_j \notin B$.

A solution $x$ is basic if it is the solution associated with some basis of $M_A$, chosen
from the columns of $A$. If $a_j$ belongs to the basis under consideration, we call $a_j$ a basic
column and the corresponding $x_j$ a basic variable.

2.3. Lemma. The following hold for basic solutions of $Ax = b$:

• the number of basic solutions of $Ax = b$ is finite; and

• a solution $x$ is basic if and only if the set $\{a_j : x_j \ne 0\}$ is linearly independent.

Proof. We get a basic solution for each basis $B$, so we must count the number of bases for
$M_A$ consisting of columns of $A$. Assuming that $\dim M_A = k$, we can have no more than

\[ \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \]

since this is the number of ways of picking any $k$ columns from the possible $n$ columns
from $A$. It may be that some of the sets of $k$ columns of $A$ are not linearly independent, in
which case we get fewer bases, but in any case, the number is finite.

If $x$ is a basic solution, we can write $b = \sum_{j} x_j a_j$ in terms of all the columns of $A$.
There is some basis $B$ consisting of columns of $A$ such that $x_j = 0$ if $a_j \notin B$. Thus

\[ B \supseteq \{a_j \mid x_j \ne 0\}, \]

and the right hand side, being a subset of a basis, is certainly linearly independent.

Conversely if $\{a_j \mid x_j \ne 0\}$ is linearly independent, we can extend it to get a basis $B$ of
$M_A$ consisting of columns of $A$. Clearly $x$ is then basic. □
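
The counting argument in this proof can be tried out on a small system. The Python sketch below is an illustration only: it borrows the 3 x 4 system that appears later in Example 2.13, tests every choice of three columns with a determinant, and solves for the associated basic solution by Cramer's rule. The helper names `det3` and `basic_solutions` are made up here, and the determinant test is just one convenient way of recognising a basis when k = m = 3.

```python
from fractions import Fraction
from itertools import combinations
from math import comb

# The system of Example 2.13: three equations, four unknowns.
A = [[6, -16, -3, 1],
     [-7, 21, 4, -1],
     [-3, 9, 2, -1]]
b = [13, -10, -2]


def det3(M):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))


def basic_solutions():
    """Map each basis (a triple of column indices) to its basic solution."""
    n = len(A[0])
    found = {}
    for cols in combinations(range(n), 3):
        M = [[A[i][j] for j in cols] for i in range(3)]
        d = det3(M)
        if d == 0:
            continue                      # these columns do not form a basis
        x = [Fraction(0)] * n             # non-basic variables stay at 0
        for pos, j in enumerate(cols):
            Mj = [row[:] for row in M]
            for i in range(3):
                Mj[i][pos] = b[i]         # Cramer's rule: replace one column by b
            x[j] = Fraction(det3(Mj), d)
        found[cols] = tuple(x)
    return found


solutions = basic_solutions()
print(len(solutions), "basic solutions; the bound is", comb(4, 3))
for cols, x in solutions.items():
    print(cols, x, "feasible" if min(x) >= 0 else "not feasible")
```

Here every triple of columns happens to be a basis, so the bound is attained; only some of the resulting basic solutions are feasible.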

2.4. Definition. A basic solution $x$ is degenerate if $\{a_j : x_j \ne 0\}$ is a proper subset of
a basis for $M_A$.

Thus if a degenerate basic solution x is associated with a basis B then at least one of 
the basic variables is zero — and we generate x without using as many basic columns as 
might have been expected. 

2.5. Definition. A basic feasible solution is a solution that is both basic and feasible. 

2.3 Row-equivalent matrices 

We say that the matrices A and C are row-equivalent if one can be obtained from the 
other by a succession of operations of the following kind 

• interchanging two rows; 

• multiplying a row by a non-zero constant; or 

• replacing a row by the sum of it and (a non-zero multiple of) another row. 





Recall that these operations are known as row operations and that they have an
interpretation directly in terms of matrix algebra. We write $R(A)$ for the matrix obtained
by performing the row operation $R$ on the matrix $A$. Each row operation $R$ defines an
elementary matrix $E_R = R(I_n)$ by performing the operation $R$ on the identity matrix $I_n$.
A calculation then shows that $E_R \cdot A = R(A)$. Recall also, or check directly, that each $E_R$
is invertible. Thus if $A$ and $C$ are row-equivalent matrices, there is a sequence $E_1, E_2, \ldots,
E_k$ of elementary matrices such that $C = E_1 E_2 \cdots E_k A$, and so, in particular, an invertible
matrix $P$ such that $C = PA$.

Let $[A|b]$ be the augmented matrix associated with $Ax = b$. Note that the column
space $M_A$ associated with the matrix $A$ is the same as the column space of $[A|b]$, precisely
because of our assumption above that $b \in M_A$.

2.6. Theorem. If [A|b] and [C|d] are row-equivalent then Ax = b and Cx = d have the 
same set of solutions. 

Proof. Since $[A|b]$ and $[C|d]$ are row-equivalent, there is an invertible matrix $P$ such that
$PA = C$ and $Pb = d$. Then

\[ Ax = b \iff PAx = Pb \quad \text{(because $P$ is invertible)} \iff Cx = d. \]

□

2.7. Corollary. If $A$ and $C$ are row-equivalent then $Ax = 0$ and $Cx = 0$ have the same
set of solutions.

Proof. If $A$ and $C$ are row equivalent, then $[A|0]$ is row equivalent to $[C|0]$, since $P0 = 0$
for any invertible $P$. The result thus follows from 2.6. □

2.8. Corollary. Let $A = [a_1, \ldots, a_n]$ and $C = [c_1, \ldots, c_n]$ be $m \times n$ row-equivalent
matrices. Then

\[ \sum_{i=1}^{n} x_i a_i = 0 \quad \text{if and only if} \quad \sum_{i=1}^{n} x_i c_i = 0. \]

Proof. This is just a restatement of the previous result. □

2.9. Corollary. Let $\{a_{h_1}, \ldots, a_{h_k}\}$ be a set of columns of $A$ and let the corresponding set
of columns of $C$ be $\{c_{h_1}, \ldots, c_{h_k}\}$. Assume that $A$ and $C$ are row-equivalent. Then

1. $a_j = \sum_{i=1}^{k} \lambda_i a_{h_i}$ if and only if $c_j = \sum_{i=1}^{k} \lambda_i c_{h_i}$;

2. The set $\{a_{h_1}, \ldots, a_{h_k}\}$ is a basis for $M_A$ if and only if $\{c_{h_1}, \ldots, c_{h_k}\}$ is a basis for $M_C$.

3. $\dim M_A = \dim M_C$.




2.4 Computation of basic solutions 


2.10. Definition. Let $e_k$ be the $m$-dimensional unit vector $(0, \ldots, 1, \ldots, 0)$ with a 1 in
the $k$-th place and 0’s elsewhere. The $m \times n$ matrix $A$ is in canonical form if all the
vectors $e_k$ occur as columns of $A$, except that $e_k$ need not occur if the $k$-th row of $A$ is
zero. We say $[A|b]$ is in canonical form if $A$ is.


Now suppose that solutions to Ax = b do exist. Then each matrix [C|d] that is row- 
equivalent to [A|b] and is in canonical form determines, using 2.9, a basic solution for 
Ax = b. We say that this is the basic solution associated with [C|d] . We show how this 
can occur with an example. 

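2.11. Example. Let

\[
[A|b] = \left(\begin{array}{rrrr|r}
6 & -16 & -3 & 1 & 13 \\
-7 & 21 & 4 & -1 & -10 \\
-3 & 9 & 2 & -1 & -2
\end{array}\right)
\quad \text{and} \quad
[C|d] = \left(\begin{array}{rrrr|r}
0 & 2 & 0 & 1 & 1 \\
1 & -1 & 0 & 0 & 7 \\
0 & 4 & 1 & 0 & 10
\end{array}\right).
\]

Given that $[C|d]$ is row-equivalent to $[A|b]$, interpret Corollary 2.9 by finding a basic
solution of $Ax = b$.

Solution Note first that $[C|d]$ is in canonical form; columns 1, 3 and 4 are respectively $e_2$,
$e_3$ and $e_1$. We have $c_1 = e_2$, $c_3 = e_3$ and $c_4 = e_1$. Clearly

\[ d = \begin{pmatrix} 1 \\ 7 \\ 10 \end{pmatrix} = 7c_1 + 10c_3 + 1c_4. \]

Hence $b = 7a_1 + 10a_3 + 1a_4$, from 2.9(1). Explicitly, we have

\[ \begin{pmatrix} 13 \\ -10 \\ -2 \end{pmatrix} = 7 \begin{pmatrix} 6 \\ -7 \\ -3 \end{pmatrix} + 10 \begin{pmatrix} -3 \\ 4 \\ 2 \end{pmatrix} + 1 \begin{pmatrix} 1 \\ -1 \\ -1 \end{pmatrix}. \]

Since $\{c_1, c_3, c_4\}$ is a basis for $M_C$, $\{a_1, a_3, a_4\}$ is a basis for $M_A$, by 2.9(2). Thus we
obtain $x = (7, 0, 10, 1)$, a basic solution for $Ax = b$.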

From now on we shall often make the implicit assumption that $M_A = \mathbb{R}^m$. In the
notation used earlier, this is the assumption that $k = m$.

2.12. Definition. A tableau for $Ax = b$ is a matrix $[C|d]$ in canonical form, row-
equivalent to $[A|b]$, with, listed down the side, the ordered basis of $M_A$ used to determine
the basic solution associated with $[C|d]$.

A tableau for the equation $Ax = b$ is illustrated in Table 2.1.

Notice that $b = \sum_{i=1}^{m} u_i a_{h_i}$, and hence $b = \sum_{j=1}^{n} x_j a_j$, where $x_j = u_i$ if $j = h_i$ for some
$i = 1, \ldots, m$ and $x_j = 0$ otherwise. Notice also that, in a similar way, $a_j = \sum_{i=1}^{m} y_{ij} a_{h_i}$,
where $y_{ij}$ is the $(i,j)$-th element in the array.







               a_1  a_2  ...  a_{h_i}  ...  a_j   ...  a_n  |   b
    a_{h_1}                      0          y_{1j}          |  u_1
    a_{h_2}                      0          y_{2j}          |  u_2
      ...                       ...          ...            |  ...
    a_{h_i}                      1          y_{ij}          |  u_i
      ...                       ...          ...            |  ...
    a_{h_m}                      0          y_{mj}          |  u_m

Table 2.1: A tableau for the equation Ax = b.

The standard way of reducing a matrix to canonical form is by pivoting. Let $P = [p_{ij}]$.
Choose any nonzero entry $p_{rs}$. This is taken as the pivot element and the $r$-th row is the
pivot row. Now

• multiply the pivot row by $1/p_{rs}$, and

• add $-p_{ks}$ times the (new) pivot row to the $k$-th row, for all $k \ne r$.

This results in a row-equivalent matrix $Q$ such that the $s$-th column of $Q$ is $e_r$ and any
vectors $e_k$ with $k \ne r$ that were present as columns of $P$ remain as columns of $Q$. It follows
that starting with $[A|b]$ and pivoting successively at convenient entries in $A$ only, a matrix
in canonical form, row-equivalent to $[A|b]$, is obtained.¹

2.13. Example. Illustrate this process with the set of equations

\[
\begin{aligned}
6x_1 - 16x_2 - 3x_3 + x_4 &= 13 \\
-7x_1 + 21x_2 + 4x_3 - x_4 &= -10 \\
-3x_1 + 9x_2 + 2x_3 - x_4 &= -2.
\end{aligned}
\]

Solution We show the process by which one obtains the basic solution $x = (7, 0, 10, 1)$
associated with the basis $a_1, a_3, a_4$ in Table 2.2. Here, and elsewhere, the pivot element
will be marked with an asterisk.

Note that the final part of Table 2.2 gives us, according to definition 2.12, a tableau for 
the original set of equations. 
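
The pivoting rule, applied to this example, can be sketched in Python as follows. This is a minimal illustration rather than a robust implementation: the function name `pivot` and the zero-based row and column indices are choices made here, and exact fractions are used so that no rounding occurs.

```python
from fractions import Fraction


def pivot(T, r, s):
    """Pivot the matrix T (a list of rows) at row r, column s (zero-based)."""
    p = T[r][s]
    if p == 0:
        raise ValueError("the pivot element must be non-zero")
    new_r = [v / p for v in T[r]]              # multiply the pivot row by 1/p_rs
    return [new_r if k == r
            else [v - row[s] * w for v, w in zip(row, new_r)]   # add -p_ks times it
            for k, row in enumerate(T)]


# Augmented matrix [A|b] for the equations of Example 2.13.
T = [[Fraction(v) for v in row] for row in
     [[6, -16, -3, 1, 13],
      [-7, 21, 4, -1, -10],
      [-3, 9, 2, -1, -2]]]

# Pivot in turn at the entries chosen in the worked example:
# row 1 / column 4, then row 3 / column 3, then row 2 / column 1.
for r, s in [(0, 3), (2, 2), (1, 0)]:
    T = pivot(T, r, s)

for row in T:
    print([str(v) for v in row])
```

The final rows have the unit vectors in columns 4, 1 and 3, and the last column holds the values 1, 7 and 10 of the basic variables $x_4$, $x_1$ and $x_3$, in agreement with the basic solution (7, 0, 10, 1).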

2.5 Change of basis 

In following the simplex algorithm for linear programming, we move from one basic solution 
to another. This amounts to changing from one basis to another. We do this by changing 
one basis vector at a time. 

2.14. Definition. Let $B$ be a basis for a vector space $V$ and $v$ a vector in $B$. Then a
vector $u$ of $V$ can replace $v$ in $B$ if $(B \setminus \{v\}) \cup \{u\}$ is a basis for $V$.

¹ This is one place where we make the implicit assumption that $M_A = \mathbb{R}^m$; otherwise we could not
guarantee that every $e_k$ could be produced, as required to get a matrix in canonical form.










            a_1    a_2    a_3    a_4  |    b
             6    -16     -3      1*  |   13
            -7     21      4     -1   |  -10
            -3      9      2     -1   |   -2
    a_4      6    -16     -3      1   |   13
            -1      5      1      0   |    3
             3     -7     -1*     0   |   11
    a_4     -3      5      0      1   |  -20
             2*    -2      0      0   |   14
    a_3     -3      7      1      0   |  -11
    a_4      0      2      0      1   |    1
    a_1      1     -1      0      0   |    7
    a_3      0      4      1      0   |   10

Table 2.2: Obtaining the basic solution x = (7, 0, 10, 1) associated with the basis a_1, a_3, a_4.


All this is saying is that if we remove v and put u in its place, we are interested in 
whether we still get a basis. The new set of vectors may be linearly independent, in which 
case we have another basis, or linearly dependent, in which case we may not replace v by 
u. Here is a useful criterion to decide when we can make such a change. 

2.15. Theorem. The vector $u$ can replace $v$ in $B$ if and only if, when $u$ is expressed in
terms of $B$, the coefficient of $v$ is nonzero.

Proof. Suppose first that when $u$ is expressed in terms of $B$ the coefficient of $v$ is nonzero.
Since a basis is a spanning set, we can write

\[ u = \lambda v + \sum_{i=1}^{n} \lambda_i v_i \]

for some coefficients $\lambda$ and $\lambda_i$, where $v_1, \ldots, v_n$ are the remaining elements of $B$, and we
are given $\lambda \ne 0$. In order to show that $B' = (B \setminus \{v\}) \cup \{u\}$ is a basis, it is enough to
show it is linearly independent, since it has the same number of elements as the basis $B$.
So assume that

\[ \mu u + \sum_{i=1}^{n} \mu_i v_i = 0; \]

we show that each coefficient vanishes. From the given expression for $u$ in terms of $B$, we
have

\[ \mu \lambda v + \sum_{i=1}^{n} (\mu \lambda_i + \mu_i) v_i = 0, \]

and since this is a linear relation between members of the basis $B$, each coefficient must
vanish. In particular, since $\lambda \ne 0$, we have $\mu = 0$, and so also $\mu_i = 0$ for all $i$. Thus the set
$B'$ is linearly independent.









2.6. RETURN TO THE LINEAR PROGRAMMING PROBLEM 


19 


Conversely, suppose that the vector u can replace v in B, and define B' as above. Since 
it is a basis, it spans the space, so we can write 


n 



and y is non-zero, since otherwise we have 


n 


v = 


i =1 


and this is a non-trivial linear combination of (linearly independent) elements from the 
basis B, which cannot occur. 

Thus y is non-zero, we can write 



and the coefficient — is non-zero as required. 


□ 


Now, in a tableau for $Ax = b$, let $a_{h_r}$ be one of the basic columns and $a_s$ some column
of $A$. Then $a_s$ can replace $a_{h_r}$ in the basis $\{a_{h_1}, a_{h_2}, \dots, a_{h_m}\}$ if and only if $y_{rs} \neq 0$. If
$y_{rs} \neq 0$, then $a_{h_r}$ is replaced by $a_s$ by pivoting at $y_{rs}$. In other words, if we wish to change
the basis so as to include $a_s$, when choosing a row to swap out of the basis by pivoting, we
only consider rows for which the corresponding entry is non-zero, in order to keep a basis.
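As a worked instance of this criterion, consider the data of Table 2.2, where the columns of the original equations are $a_1 = (6, -7, -3)$, $a_2 = (-16, 21, 9)$, $a_3 = (-3, 4, 2)$ and $a_4 = (1, -1, -1)$. In the final tableau the column for $a_2$ has entries $2, -1, 4$ against the basis rows $a_4, a_1, a_3$, and indeed:

```latex
\[
  2a_4 - a_1 + 4a_3
  = 2(1,-1,-1) - (6,-7,-3) + 4(-3,4,2)
  = (2 - 6 - 12,\; -2 + 7 + 16,\; -2 + 3 + 8)
  = (-16,\, 21,\, 9) = a_2 .
\]
```

All three coefficients are non-zero, so by Theorem 2.15 the vector $a_2$ could replace any one of $a_4$, $a_1$ or $a_3$ in this basis. Had one of the entries been zero, the corresponding row could not have been used as a pivot row.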


2.6 Return to the Linear Programming Problem 

Now let us consider the problem: maximise (or minimise) $x_{n+1}$ subject to $Ax = b$, and
$x_1, x_2, \dots, x_n \geq 0$. We shall assume that the final column of $A$ is $a_{n+1} = (0, 0, \dots, 1)$,
in view of the way in which the standard form of a Linear Programming Problem is set
up. Henceforth $a_{n+1}$ will also be called $e$ and we make the rule that, in any tableau $T$ for
$Ax = b$, the vector $e$ must be in the basis as the last member of the basis. The tableau $T$
thus takes the form shown in Table 2.3.

The entries in the last column of the tableau give the associated basic solution. The
entry $z$ is precisely the value of the objective variable, also denoted by $x_{n+1}$. The associated
basic solution is feasible if and only if $u_i \geq 0$ for $i = 1, 2, \dots, m$.

Suppose it should happen that for some $s \neq h_r$, we have $y_{rs} = 1$ and $y_{is} = 0$ for $i \neq r$.
Then replacing $a_{h_r}$ by $a_s$ in the basis has essentially no effect. We shall assume that this
unlikely occurrence never arises.

2.16. Theorem. Suppose that the basic solution associated with a tableau is feasible. Then
replacing $a_{h_r}$ by $a_s$ in the basis yields

1. a distinct basic feasible solution if and only if $y_{rs} > 0$ and

\[ \frac{u_r}{y_{rs}} = \min\left\{ \frac{u_i}{y_{is}} : y_{is} > 0 \right\} > 0; \]

and

2. the same basic feasible solution if and only if $y_{rs} \neq 0$ and $u_r = 0$.


             a_1   a_2   ...   a_s   ...   a_n  |  e  |   b
   a_{h_1}                     y_1s             |  0  |  u_1
   a_{h_2}                     y_2s             |  0  |  u_2
     ...                       ...              | ... |  ...
   a_{h_i}                     y_is             |  0  |  u_i
     ...                       ...              | ... |  ...
   a_{h_r}                     y_rs             |  0  |  u_r
     ...                       ...              | ... |  ...
   a_{h_m}                     y_ms             |  0  |  u_m
   e         v_1   v_2   ...   v_s   ...   v_n  |  1  |   z

Table 2.3: More detail in a tableau for the equation Ax = b.


Proof. In order to replace $a_{h_r}$ by $a_s$ in the basis, we pivot at $y_{rs}$ in the tableau. This is
shown in Table 2.4, where we do the pivot in two steps for simplicity.

We start with a basic feasible solution. In order to ensure that we keep a basic solution
at each stage of the pivot, it is essential by Theorem 2.15 that $y_{rs} \neq 0$. We now examine
what is needed in order to ensure we have a new basic feasible solution.

Note first that if $u_r = 0$, then although we have moved to a new basic feasible solution,
it is not distinct from the old one. For the old basic feasible solution has $x_{h_i} = u_i$ for all $i$,
and $x_j = 0$ whenever $a_j$ is not in the basis. In particular, $x_{h_r} = 0$ since $u_r = 0$, and $x_s = 0$
since $a_s$ is not in the basis. After replacing $a_{h_r}$ by $a_s$ in the basis, the same result still
holds, and none of the other parts of the basic solution have changed. We thus assume in
what follows that $u_r \neq 0$.

Clearly we must have $u_r/y_{rs} \geq 0$ for feasibility, otherwise the requirement that $x_s \geq 0$
fails. Since we started with a feasible solution, $u_r \geq 0$, and as $u_r \neq 0$ this means that $y_{rs} > 0$ is a
necessary condition for the new basic solution to be feasible. The other requirement for
feasibility is that the remaining members of the $b$ column are non-negative, so

\[ u_i - y_{is}(u_r/y_{rs}) \geq 0 \quad \text{for } i = 1, 2, \dots, m,\ i \neq r. \]

If $y_{is} \leq 0$, this condition certainly holds, since $u_r/y_{rs} > 0$, so we are not decreasing $u_i$ when
we make the change; thus the solution stays feasible.

It remains to consider the situation when $y_{is} > 0$, in which case we must have

\[ u_i - y_{is}(u_r/y_{rs}) \geq 0, \quad \text{or} \quad \frac{u_i}{y_{is}} \geq \frac{u_r}{y_{rs}}, \]

and this must hold for each relevant $i$. Note also that the division performed to get the
second form of the above condition is valid, since we are assuming that $y_{is} > 0$. Thus,
in order that the new solution should remain feasible, the minimum value of $u_i/y_{is}$, taken
over all indices $i$ for which $y_{is} > 0$, must occur when $i = r$. □











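             a_1   a_2   ...   a_s   ...   a_n  |  e  |   b
   a_{h_1}                     y_1s             |  0  |  u_1
   a_{h_2}                     y_2s             |  0  |  u_2
     ...                       ...              | ... |  ...
   a_{h_i}                     y_is             |  0  |  u_i
     ...                       ...              | ... |  ...
   a_{h_r}                      1               |  0  |  u_r/y_rs
     ...                       ...              | ... |  ...
   a_{h_m}                     y_ms             |  0  |  u_m
   e         v_1   v_2   ...   v_s   ...   v_n  |  1  |   z

   a_{h_1}                      0               |  0  |  u_1 - y_1s(u_r/y_rs)
   a_{h_2}                      0               |  0  |  u_2 - y_2s(u_r/y_rs)
     ...                       ...              | ... |  ...
   a_{h_i}                      0               |  0  |  u_i - y_is(u_r/y_rs)
     ...                       ...              | ... |  ...
   a_s                          1               |  0  |  u_r/y_rs
     ...                       ...              | ... |  ...
   a_{h_m}                      0               |  0  |  u_m - y_ms(u_r/y_rs)
   e         v_1   v_2   ...    0    ...   v_n  |  1  |  z - v_s(u_r/y_rs)

Table 2.4: Moving from one basic column to a new one.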


Suppose that it has been decided which new column $a_s$ should be included in the basis.
Then part 1 of the theorem is important, for it determines which basis element $a_{h_r}$ should be replaced.
Part 2 states that on pivoting at $y_{rs}$ the same basic feasible solution is obtained only if
the original basic feasible solution was degenerate.

2.17. Example. Minimise $-4x_1 + 3x_2$ subject to

\[
\begin{aligned}
x_1 - 2x_2 &\geq -4, \\
2x_1 + 3x_2 &\leq 13, \\
x_1 - x_2 &\leq 4, \\
x_1, x_2 &\geq 0.
\end{aligned}
\]


Solution Table 2.5 shows three tableaus for this problem.

The first, $T_1$, is obtained from the related standard form and has basic feasible solution
$(0, 0, 4, 13, 4; 0)$. Pivoting about the position $y_{31} (= 1)$ gives tableau $T_2$ with basic feasible
solution $(4, 0, 8, 5, 0; -16)$. Pivoting about the position $y_{22} (= 5)$ then gives tableau $T_3$ with
basic feasible solution $(5, 1, 7, 0, 0; -17)$. The asterisks indicate the pivot elements.
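The choice of pivot row in each step above follows the minimum-ratio condition of Theorem 2.16. A small sketch of that test is given below; the function name `ratio_test` and the 0-based row numbering are our own assumptions for illustration, and the data are the entering columns and $b$ columns read from $T_1$ and $T_2$ of Table 2.5.

```python
from fractions import Fraction

def ratio_test(column, b):
    """Return the 0-based row r minimising b[r]/column[r] over column[r] > 0,
    or None when no entry of the column is positive."""
    best = None
    for i, (y, u) in enumerate(zip(column, b)):
        if y > 0:
            ratio = Fraction(u) / Fraction(y)
            if best is None or ratio < best[0]:
                best = (ratio, i)
    return None if best is None else best[1]

# T_1, entering column a_1: y = (-1, 2, 1), u = (4, 13, 4).
r1 = ratio_test([-1, 2, 1], [4, 13, 4])
# T_2, entering column a_2: y = (1, 5, -1), u = (8, 5, 4).
r2 = ratio_test([1, 5, -1], [8, 5, 4])
print(r1, r2)  # row indices, counted from 0
```

Counting rows from 0, the results correspond to the third row of $T_1$ (the pivot $y_{31}$) and the second row of $T_2$ (the pivot $y_{22}$), as used in the solution.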

We now turn to the problem of deciding which new column a s should be included in 
the basis. 

2.18. Definition. The entry $v_s$ in the bottom row of a tableau has the proper sign for
improvement if either $v_s < 0$ in a maximising problem, or $v_s > 0$ in a minimising problem.
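Definition 2.18 translates directly into a test on the bottom row. The sketch below is illustrative only; the name `proper_sign_columns` is hypothetical, columns are numbered from 0, and the bottom row is given without its entries for $e$ and $b$.

```python
from fractions import Fraction

def proper_sign_columns(bottom_row, maximising):
    """Indices (0-based) of entries v_s with the proper sign for improvement."""
    if maximising:
        return [s for s, v in enumerate(bottom_row) if v < 0]
    return [s for s, v in enumerate(bottom_row) if v > 0]

# Bottom rows from Example 2.17 (a minimising problem).
first = proper_sign_columns([4, -3, 0, 0, 0], maximising=False)
last = proper_sign_columns(
    [0, 0, 0, Fraction(-1, 5), Fraction(-18, 5)], maximising=False)
print(first, last)
```

For the first tableau only the column $a_1$ qualifies, while for the third tableau the list is empty, so no further improvement is possible there.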













                a_1    a_2    a_3    a_4     a_5    e  |    b
   T_1   a_3    -1      2      1      0       0     0  |    4
         a_4     2      3      0      1       0     0  |   13
         a_5     1*    -1      0      0       1     0  |    4
         e       4     -3      0      0       0     1  |    0

   T_2   a_3     0      1      1      0       1     0  |    8
         a_4     0      5*     0      1      -2     0  |    5
         a_1     1     -1      0      0       1     0  |    4
         e       0      1      0      0      -4     1  |  -16

   T_3   a_3     0      0      1    -1/5     7/5    0  |    7
         a_2     0      1      0     1/5    -2/5    0  |    1
         a_1     1      0      0     1/5     3/5    0  |    5
         e       0      0      0    -1/5   -18/5    1  |  -17

Table 2.5: Three tableaus for Example 2.17.


2.19. Theorem. Suppose that, in a tableau with corresponding objective value $z$, the entry
$v_s$ has the proper sign for improvement and that on pivoting at $y_{rs}$ a new basic feasible
solution is obtained with objective value $z'$. Then in a maximising problem we have $z' \geq z$,
and in a minimising problem we have $z' \leq z$.


Proof. We refer back to the second tableau in Table 2.4, and note that the value in the cell
in the bottom right of the table is the new value of the objective function, because $e$ remains
in the basis of the column space. Writing $z'$ for this new value, we have
$z' = z - v_s(u_r/y_{rs})$. Noting that we always have $u_r/y_{rs} \geq 0$, we have

\[ z' \leq z \text{ if } v_s > 0, \quad \text{while} \quad z' \geq z \text{ if } v_s < 0. \]

Further, if we assume we are moving to a new distinct basic feasible solution, so can assume
that $u_r/y_{rs} > 0$, then

\[ z' < z \text{ iff } v_s > 0, \quad \text{while} \quad z' > z \text{ iff } v_s < 0. \]

□
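To see the change in objective value numerically, the bottom-right entry of Table 2.4 can be evaluated for the two pivots of Example 2.17, a minimising problem. The values of $z$, $v_s$, $u_r$ and $y_{rs}$ below are read from Table 2.5.

```latex
\[
  z' = z - v_s \frac{u_r}{y_{rs}} :
  \qquad
  T_1 \to T_2:\; z' = 0 - 4 \cdot \frac{4}{1} = -16,
  \qquad
  T_2 \to T_3:\; z' = -16 - 1 \cdot \frac{5}{5} = -17 .
\]
```

In both steps $v_s > 0$ and $u_r/y_{rs} > 0$, so the objective value strictly decreases, as the proof predicts.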


This theorem ensures that we choose a column that improves the objective value. We
are now in a position to describe the so-called ‘one-phase’ simplex algorithm.


2.7 The One-phase Simplex Algorithm 

Suppose that $T$ is a tableau such that the associated basic solution is feasible. (It may be
necessary to use the second phase of the algorithm, described below in Section 4.2, to obtain
such a tableau.) Choose, if possible, an $s$ such that $v_s$ has the proper sign for improvement.
Then choose, if possible, an $r$ such that another (not necessarily new) basic feasible solution
is obtained on replacing $a_{h_r}$ by $a_s$ in the basis by pivoting about $y_{rs}$. Continue the process
as long as possible.

Note that we never try to pivot in the same column twice in succession, since after a pivot the
corresponding $v_s$ is zero and so no longer has the proper sign for improvement. Thus even if we don’t get a
new basic feasible solution, the algorithm does not stall by repeating the same choice each
time.

In this description, there are two occurrences of “if possible”. The process is bound to
terminate at one of these points. If there is no $v_s$ with the proper sign for improvement
then an optimal solution has been obtained. If on the other hand it is not possible to find a
suitable $r$ then the problem has no optimal solution. At this stage, we have not yet proved
these claims; we consider them further in Section 2.8, where we establish that provided the
process terminates, it does so as claimed here. Figure 2.1 gives a flow-chart that illustrates
and amplifies the description just given.


START
  |
  v
Is there an s such that v_s has the proper sign for improvement?
  No:  Box A. An optimal solution has been found. The optimal value is
       in the last column. STOP.
  Yes: Choose such an s.
  |
  v
Is there an r with y_rs > 0?
  No:  Box B. There is no optimal solution. STOP.
  Yes: Choose r to minimise u_r/y_rs subject to y_rs > 0.
  |
  v
Pivot at y_rs to form a new tableau with a_s replacing a_{h_r} in the basis;
then return to the first question.

Figure 2.1: Flow chart for one-phase simplex algorithm.
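The flow chart can be followed mechanically. The sketch below is one possible rendering in Python, not the course's own code: the function name `one_phase_simplex`, the omission of the constant column for $e$, the 0-based indices, the choice of the lowest-index column in place of "choose such an $s$", and the pivot limit are all our own assumptions. Because of the column choice, the intermediate tableaus need not match those shown in the worked examples, although the final answer should.

```python
from fractions import Fraction

def one_phase_simplex(tableau, basis, maximising=True, max_pivots=1000):
    """Follow the flow chart on a tableau whose basic solution is feasible.

    tableau: constraint rows followed by the objective row; each row lists
             the coefficients of a_1..a_n and then the b entry (the column
             for e never changes, so it is left out of this sketch).
    basis:   0-based column index labelling each constraint row.
    Returns ("optimal", x, z) or ("unbounded", None, None).
    """
    t = [[Fraction(entry) for entry in row] for row in tableau]
    basis = list(basis)
    m, n = len(t) - 1, len(t[0]) - 1
    for _ in range(max_pivots):
        v = t[m][:n]
        # First question: any entry with the proper sign for improvement?
        cols = [s for s in range(n) if (v[s] < 0 if maximising else v[s] > 0)]
        if not cols:  # Box A
            sol = [Fraction(0)] * n
            for i, h in enumerate(basis):
                sol[h] = t[i][n]
            return "optimal", sol, t[m][n]
        s = cols[0]
        # Second question: is there an r with y_rs > 0?
        rows = [i for i in range(m) if t[i][s] > 0]
        if not rows:  # Box B
            return "unbounded", None, None
        # Minimise u_r / y_rs; ties go to the smallest basis label.
        r = min(rows, key=lambda i: (t[i][n] / t[i][s], basis[i]))
        # Pivot at y_rs.
        p = t[r][s]
        t[r] = [entry / p for entry in t[r]]
        for i in range(m + 1):
            if i != r and t[i][s] != 0:
                f = t[i][s]
                t[i] = [a - f * c for a, c in zip(t[i], t[r])]
        basis[r] = s
    raise RuntimeError("pivot limit reached (possible cycling)")

# Maximise 4x_1 + 3x_2 subject to 3x_1 + 4x_2 <= 12, 7x_1 + 2x_2 <= 14,
# with slack columns a_3, a_4 forming the initial basis.
status, x, z = one_phase_simplex(
    [[3, 4, 1, 0, 12],
     [7, 2, 0, 1, 14],
     [-4, -3, 0, 0, 0]],
    basis=[2, 3])
print(status, [str(value) for value in x], z)
```

The returned list `x` holds the values of all the variables, slack variables included, and `z` is the entry in the bottom right of the final tableau.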


2.20. Example. Maximise $4x_1 + 3x_2$ subject to

\[
\begin{aligned}
3x_1 + 4x_2 &\leq 12, \\
7x_1 + 2x_2 &\leq 14, \\
x_1, x_2 &\geq 0.
\end{aligned}
\]

Solution Table 2.6 shows the tableaus obtained by following the algorithm. The tableau $T_1$
is obtained from the related standard form and has basic feasible solution $(0, 0, 12, 14; 0)$.
Pivoting about $y_{12} (= 4)$ gives tableau $T_2$ with basic feasible solution $(0, 3, 0, 8; 9)$. Pivoting
about $y_{21} (= 22/4)$ then gives $T_3$ with basic feasible solution $(16/11, 21/11, 0, 0; 127/11)$.
There is now no $v_s$ with the proper sign for improvement. We have an optimal solution of
the original problem $x_1 = 16/11$, $x_2 = 21/11$ with optimal value $127/11$.



































                a_1      a_2     a_3      a_4     e  |     b
   T_1   a_3     3        4*      1        0      0  |    12
         a_4     7        2       0        1      0  |    14
         e      -4       -3       0        0      1  |     0

   T_2   a_2    3/4       1      1/4       0      0  |     3
         a_4   22/4*      0     -2/4       1      0  |     8
         e     -7/4       0      3/4       0      1  |     9

   T_3   a_2     0        1      7/22    -3/22    0  |   21/11
         a_1     1        0     -1/11     2/11    0  |   16/11
         e       0        0     13/22     7/22    1  |  127/11

Table 2.6: Example 2.20: three tableaus.


2.21. Example. Minimise $-4x_1 + 3x_2$ subject to

\[
\begin{aligned}
x_1 - 2x_2 &\geq -4, \\
2x_1 + 3x_2 &\leq 13, \\
x_1 - x_2 &\leq 4, \\
x_1, x_2 &\geq 0.
\end{aligned}
\]

This example was studied in Section 2.6 and the tableaus given there in Table 2.5 were
obtained by following the algorithm. In $T_3$ of Table 2.5 no $v_s$ has the proper sign for
improvement, so an optimal solution of the original problem is $x_1 = 5$, $x_2 = 1$ with optimal
value $-17$.


2.8 A partial proof 

To be sure that the description given is a genuine algorithm, it is necessary to fill in three 
gaps. We must prove that 

1. the claim relating to Box A is true; 

2. the claim relating to Box B is true; and 

3. the process is bound to terminate after a finite number of steps. 

We can deal completely with (1) and (2) but will see that difficulties arise in trying to 
establish (3). 

2.22. Theorem. If in a tableau there is no $v_s$ with the proper sign for improvement then
the corresponding basic feasible solution is optimal.

Proof. Suppose for clarity that the original problem is a maximising problem, and that our
final tableau exhibits a basic feasible solution $x = (x_1, \dots, x_n; x_{n+1})$. Assume also that
none of the $v_j$'s exhibits the proper sign for improvement, so they are all non-negative, and our
algorithm has halted in Box A. Note that, since we have a basic feasible solution, we have
$x_{n+1} = z$.

This tableau is row-equivalent to the original problem, and thus represents a system
of equations with the same solutions as the initial set of equations. Suppose we have any
other feasible solution $\bar{x}$ of the initial system $Ax = b$, not necessarily a basic one. In
particular, $\bar{x}$ gives a solution of the last equation in the tableau, and so

\[ \sum_{j=1}^{n} v_j \bar{x}_j + \bar{x}_{n+1} = z \quad \text{and so} \quad x_{n+1} - \bar{x}_{n+1} = \sum_{j=1}^{n} v_j \bar{x}_j. \]

By our assumption, all the $v_j$'s are non-negative, and each $\bar{x}_j \geq 0$, because this is a feasible
solution. Thus $x_{n+1} \geq \bar{x}_{n+1}$, and so $x_{n+1} = z$ is the optimal value. □

This establishes (1), the claim relating to Box A. 
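The argument can be seen concretely in the final tableau of Example 2.20, where the objective variable is $x_5$ and the bottom row of $T_3$ in Table 2.6 has entries $0, 0, 13/22, 7/22, 1$ and $127/11$. Read as an equation, that row gives the following bound for every feasible solution.

```latex
\[
  \tfrac{13}{22}\,x_3 + \tfrac{7}{22}\,x_4 + x_5 = \tfrac{127}{11}
  \quad\Longrightarrow\quad
  x_5 = \tfrac{127}{11} - \tfrac{13}{22}\,x_3 - \tfrac{7}{22}\,x_4
  \;\le\; \tfrac{127}{11}
  \qquad (x_3,\, x_4 \ge 0),
\]
```

with equality exactly when $x_3 = x_4 = 0$, which is the basic feasible solution found by the algorithm.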

2.23. Theorem. Suppose that in a tableau there is an $s$ such that $v_s$ has the proper sign
for improvement and $y_{rs} \leq 0$ for all $r = 1, 2, \dots, m$. Then there is no optimal solution.

Proof. We use linearity crucially here. Specifically, we use the fact that adding any solution
of the equation $Ax = 0$ to a solution of the equation $Ax = b$ gives another solution of the
equation $Ax = b$.

Assume again for clarity that we are solving a maximising problem, that we have found
a column such that $v_s < 0$, but that for every $r$, we have $y_{rs} \leq 0$. Then since this column
expresses $a_s$ in terms of the current basis of the column space, we have

\[ a_s = \sum_{i=1}^{m} y_{is} a_{h_i} + v_s e. \]

Re-arranging gives

\[ a_s + \sum_{i=1}^{m} (-y_{is}) a_{h_i} + (-v_s) e = 0. \tag{2.1} \]

Note that in this form of the equation, all the coefficients are non-negative.

Now let the current basic feasible solution be $(x_1, x_2, \dots, x_n; z)$, so that

\[ \sum_{i=1}^{m} x_{h_i} a_{h_i} + z e = b. \tag{2.2} \]

Adding a multiple $\theta$ of equation (2.1) to equation (2.2) then gives

\[ \theta a_s + \sum_{i=1}^{m} (x_{h_i} - \theta y_{is}) a_{h_i} + (z - \theta v_s) e = b \]

and this gives a family of feasible solutions with objective values $z' = z - \theta v_s$ for every
$\theta \geq 0$. □

This establishes (2), the claim relating to Box B. In these circumstances there is no
optimal solution in that the feasible region is unbounded and an arbitrarily large (positive
if maximising, negative if minimising) objective value is possible. Indeed, as we have shown,
one can find a family $x(\theta)$ of feasible solutions such that, as $\theta \to \infty$, the objective value
$f(x(\theta))$ tends to $+\infty$ (if maximising) or $-\infty$ (if minimising).




2.24. Example. Maximise $2x_1 + x_2$ subject to

\[
\begin{aligned}
-x_1 + x_2 &\leq 1, \\
x_1 - 2x_2 &\leq 2, \\
x_1, x_2 &\geq 0.
\end{aligned}
\]

Table 2.7 shows the tableaus obtained by following the algorithm. The tableau $T_1$
is obtained from the related standard form and has basic feasible solution $(0, 0, 1, 2; 0)$.
Pivoting about $y_{21} (= 1)$ gives tableau $T_2$ with basic feasible solution $(2, 0, 3, 0; 4)$. Now
$v_2 (= -5)$ has the proper sign for improvement but there is no $r$ such that $y_{r2} > 0$. Thus
there is no optimal solution.


                a_1    a_2    a_3    a_4    e  |   b
   T_1   a_3    -1      1      1      0     0  |   1
         a_4     1*    -2      0      1     0  |   2
         e      -2     -1      0      0     1  |   0

   T_2   a_3     0     -1      1      1     0  |   3
         a_1     1     -2      0      1     0  |   2
         e       0     -5      0      2     1  |   4

Table 2.7: Tableaus showing that Example 2.24 has no optimal solution.

From the method used in the proof of the preceding theorem, we find that

\[ (2 + 2\theta,\ \theta,\ 3 + \theta,\ 0;\ 4 + 5\theta) \]

is a feasible solution for all $\theta \geq 0$. Thus $x_1 = 2 + 2\theta$, $x_2 = \theta$ is a feasible solution of the
original problem with objective value $4 + 5\theta$. These form a family $x(\theta)$ of the kind described
above.
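A quick numerical check of this family is easy to write. The sketch below simply substitutes a few values of $\theta$ into the constraints and objective of Example 2.24; the function names are our own and the sample values of $\theta$ are arbitrary.

```python
def family(theta):
    """The point (x1, x2) of the family x(theta) in Example 2.24."""
    return 2 + 2 * theta, theta

def is_feasible(x1, x2):
    """Check the constraints of Example 2.24."""
    return (-x1 + x2 <= 1) and (x1 - 2 * x2 <= 2) and x1 >= 0 and x2 >= 0

def objective(x1, x2):
    return 2 * x1 + x2

for theta in [0, 1, 10, 1000]:
    x1, x2 = family(theta)
    print(theta, is_feasible(x1, x2), objective(x1, x2))
```

Each point is feasible, and the objective grows like $4 + 5\theta$, so no finite maximum exists.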

2.25. Theorem. If every basic feasible solution of a Linear Programming Problem is
non-degenerate then the process of following the simplex method does terminate after a finite
number of steps.

Proof. It is enough to show that our algorithm terminates. However, except in the case
of degeneracy, we move to a different basic feasible solution each time, and cannot return
to one already visited, since the objective function strictly improves. But, by Lemma 2.3,
there are only a finite number of distinct bases for the column space, so termination must
occur after at most this many steps. □

When there is degeneracy, there is the possibility that the simplex method will return 
to a previous basic feasible solution and then go round and round in an endless repetition, 
a process known as cycling. It is said to be a fact, however, that although degeneracy is 
quite common in practice, cycling has occurred only in artificially constructed examples; 
however there is a claim that one such problem occurs in the solution of a practical queueing 
model (Kolman & Beck 1995, Page 127). 












2.26. Example. Maximise $\frac{3}{4}x_1 - 20x_2 + \frac{1}{2}x_3 - 6x_4$ subject to

\[
\begin{aligned}
\tfrac{1}{4}x_1 - 8x_2 - x_3 + 9x_4 + x_5 &= 0, \\
\tfrac{1}{2}x_1 - 12x_2 - \tfrac{1}{2}x_3 + 3x_4 + x_6 &= 0, \\
x_3 + x_7 &= 1, \\
x_1, x_2, \dots, x_7 &\geq 0.
\end{aligned}
\]

Solution It is easy but tedious² to check that cycling occurs if the following sequence of
bases is taken: $\{a_5, a_6, a_7\}$, $\{a_1, a_6, a_7\}$, $\{a_1, a_2, a_7\}$, $\{a_3, a_2, a_7\}$, $\{a_3, a_4, a_7\}$, $\{a_5, a_4, a_7\}$,
$\{a_5, a_6, a_7\}$. However, by taking the sequence $\{a_5, a_6, a_7\}$, $\{a_5, a_1, a_7\}$, $\{a_5, a_1, a_3\}$, the
process can be made to terminate in the optimal basic feasible solution $(1, 0, 1, 0, \frac{3}{4}, 0, 0; \frac{5}{4})$.

We show how to check this by machine in Section 3.4.

The following result, though positive, is of only theoretical interest: 

2.27. Theorem. It is always possible, starting from any given basis with an associated 
basic feasible solution, to choose a sequence of bases so that the one-phase simplex method 
terminates. 

In practice something called Bland’s Rule (Kolman & Beck 1995) can implement the
above and so avoid the possibility of cycling. It can be shown that if the choices made
during the algorithm are done in an ordered way, then cycling is avoided. The rule is as
follows:

• when there is a choice of pivot column, always choose the one with the lowest index 
— the one furthest to the left; and 

• when there is a choice of pivot row, always prefer the row labelled by the basic variable 
with the smallest subscript. 
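The two parts of the rule can be sketched as a pivot-selection function. This is an illustration under stated assumptions, not code from the notes: the name `bland_pivot` is hypothetical, indices are 0-based, the tableau is for a maximising problem, and the data are the first tableau of Example 2.26 with the coefficients of that example taken to be 3/4, 20, 1/2, 6 in the objective and 1/4, 8, 1, 9 and 1/2, 12, 1/2, 3 in the first two constraints.

```python
from fractions import Fraction as F

def bland_pivot(rows, bottom, b, basis, maximising=True):
    """Return (r, s), 0-based, chosen by Bland's rule, or None if no pivot exists."""
    cols = [s for s, v in enumerate(bottom) if (v < 0 if maximising else v > 0)]
    if not cols:
        return None          # no proper sign for improvement
    s = cols[0]              # lowest index, i.e. furthest to the left
    cand = [i for i, row in enumerate(rows) if row[s] > 0]
    if not cand:
        return None          # no positive entry in the chosen column
    # Minimum ratio; ties go to the basic variable with the smallest subscript.
    r = min(cand, key=lambda i: (b[i] / rows[i][s], basis[i]))
    return r, s

# Constraint rows for a_1..a_7, with the b column held separately.
rows = [
    [F(1, 4), F(-8), F(-1), F(9), F(1), F(0), F(0)],
    [F(1, 2), F(-12), F(-1, 2), F(3), F(0), F(1), F(0)],
    [F(0), F(0), F(1), F(0), F(0), F(0), F(1)],
]
b = [F(0), F(0), F(1)]
bottom = [F(-3, 4), F(20), F(-1, 2), F(6), F(0), F(0), F(0)]
basis = [4, 5, 6]            # a_5, a_6, a_7, counted from 0
choice = bland_pivot(rows, bottom, b, basis)
print(choice)
```

Here two columns have the proper sign and two rows tie with ratio zero, so both parts of the rule are exercised: the leftmost column $a_1$ enters, and the row labelled $a_5$ is preferred to the row labelled $a_6$.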


2.9 Questions 2 (Hints and solutions start on page 109.) 

2.1. Q. Use the Simplex Algorithm to maximise $5x_1 + 2x_2$, subject to $x_1, x_2 \geq 0$ and

\[
\begin{aligned}
-x_1 + x_2 &\geq -2, \\
2x_1 + x_2 &\leq 7.
\end{aligned}
\]

Either give the maximum value of the objective function, and values of $x_1$ and $x_2$ at which this
maximum is obtained, or explain briefly why there is no maximum.

2.2. Q. Use the Simplex Algorithm to maximise $-3x_1 + 4x_2$, subject to $x_1, x_2 \geq 0$ and

\[
\begin{aligned}
2x_1 - x_2 &\geq -5, \\
x_1 + 3x_2 &\leq 22.
\end{aligned}
\]

Either give the maximum value of the objective function, and values of $x_1$ and $x_2$ at which this
maximum is obtained, or explain briefly why there is no maximum.


2 Thus an ideal place to use Maple. 





2.3. Q. Use the Simplex Algorithm to maximise $3x_1 - x_2$, subject to $x_1, x_2 \geq 0$ and

\[
\begin{aligned}
-x_1 + 2x_2 &\geq -5, \\
2x_1 + x_2 &\leq 15.
\end{aligned}
\]

Either give the maximum value of the objective function, and values of $x_1$ and $x_2$ at which this
maximum is obtained, or explain briefly why there is no maximum.

2.4. Q. Use the Simplex Algorithm to minimise $-x_1 - 3x_2 + x_3$, subject to $x_1, x_2, x_3 \geq 0$ and

\[
\begin{aligned}
3x_1 - 2x_2 + 3x_3 &\leq 3, \\
2x_1 - x_2 - 6x_3 &\geq -5.
\end{aligned}
\]

Either give the minimum value of the objective function and the values of $x_1$, $x_2$ and $x_3$ at which
this minimum is obtained, or explain briefly why there is no minimum.

2.5. Q. Let $V$ be a finite dimensional vector space with basis $\{v_1, v_2, \dots, v_n\}$. Given $u \in V$,

\[ u = \lambda_1 v_1 + \lambda_2 v_2 + \cdots + \lambda_n v_n \text{ say,} \]

state a necessary and sufficient condition on the coefficient $\lambda_1$ that $\{u, v_2, \dots, v_n\}$ is a basis for $V$.
Prove that your statement is correct.


Chapter 3 

Using Maple 


In this Chapter we have two rather different aims. First we show how the Maple package 
can be used to check a lot of the algebra that you will be doing. It should thus provide 
a rather easy introduction to symbolic algebra in general, and bring immediate rewards if 
your pivoting is not as accurate as it might be. The second aim is to give an alternative 
view of the simplex algorithm, which may make its relationship to equation solving clearer. 
Maple uses this second view of the process, and although it is significantly longer to write 
out than our standard version of the algorithm, another view is often helpful. Finally we 
use the package to verify the result on cycling described in Example 2.26, something that 
is rather tedious to check by hand. 

We describe briefly the procedures necessary to run a Maple session. Much of this will 
be familiar to anyone who has used the PC classrooms already. Even if you are familiar 
with the logging on sequence, you should still read through this as there are certain specific 
details which are required to get Maple working. 


3.1 Getting Access to Maple 

In Aberdeen Maple runs both on PC Windows operating systems and on Unix executing 
on the machine known as sysa. These instructions are for using the Unix version of Maple 
on sysa connecting from a PC via the X Windows system. 

3.1.1 Using Maple in a PC Classroom 

We give brief instructions for starting the Maple application, applicable to most of the PC
classrooms.¹ There are two steps to the process: getting access to the Unix machine “sysa”,
and then running Maple on sysa. In this section we describe how to get access to sysa.

1. Log-on, select the ‘Science & Engineering’ folder by double clicking it (it should 
be on the left hand side of the screen) and then select the ‘Mathematical Sciences’ 
folder. 


1 In previous years it has been possible to use Maple in the class library. This is not so at present, but
we hope it will be available again next year.




2. Select the yellow X icon labelled exceed to start X-Windows. You will find an old 
Maple icon next to this; please don’t try to use Maple by selecting it; that version 
of the software is out of date. 

3. If you are given a choice of servers, select sysa to manage your X-windows session.

4. You will now have a sysa login dialog displayed. The next step is to run Maple as 
described in Section 3.1.2. 

5. Logging out. Remember the need to log out before you leave the PC. After quitting 
Maple, select the ‘start’ button (lower left of the screen), select ‘shut down’, then 
select ‘close all programs ... ’ and click OK. 

You can also get access to sysa from the Common Applications folder going via Internet
Options. The resulting xterm will not behave well; rather, as described above, you should
access it from the Mathematical Sciences folder to get useful keyboard customisations.

3.1.2 Running Maple on sysa 

Once you have got a login prompt for sysa here is how to continue. 

1. Enter your username and password; note that you press the return key after each 
entry. An xterm window will appear. 

2. If you are asked to choose a terminal type, select option 4, and, when prompted for 
your terminal type, type: xterm. You should then have a Unix command prompt in 
an xterm window. 

3. The first thing to do is say what software you intend to use, so that some customisa¬ 
tions can be made to your login session. Type: use maple7. You will need to repeat 
this step each time you log on. 

4. If the previous step worked, it will have resulted in a one-line message; otherwise 
there will be additional instructions on the screen which you should follow. 

5. (First time only) If this is going to be your first time using Maple, I suggest you 
create a directory called mx3503. To do this type: mkdir mx3503. 

6. Change to your mx3503 directory by typing: cd mx3503. This is where your Maple
files will be kept. Every time you log on to work with Maple for this course you must
change to this directory before you start.

7. Now you can start the Maple software with the ‘xmaple’ command. Type: xmaple & 

8. When you are finished, quit Maple and then log out of X-windows by typing: exit 
in your xterm window. 

9. You should now log out of the PC. Refer back to the appropriate section above for 
instructions. 

You really have no use for the command line prompt except to start Maple. In fact the 
prompt will behave rather like the bash prompt you may have met in the JAVA course; 
indeed if you type “bash” the behaviour is extremely similar, but in this case you will be 
running on a native Unix system. 




3.1.3 Saving your work 

You can save your work between sessions by selecting File -> Save; in other words, by
choosing the “Save” option from the “File” menu. Note that there is no default file name
offered; you are shown the pattern *.mws, meaning you are expected to use a name like
myfile.mws with the given suffix, but you need to alter the * as appropriate. You may
wish to save space by selecting Edit -> Remove Output before saving, since you can always
re-run the sheet to regenerate the output.

If you started Maple by first changing to your mx3503 directory, you should find that 
Maple saves your worksheets, by default, in that directory. 

3.1.4 Printing From Maple 

When you try to print from Maple, you will be presented with a dialog box. The first time 
you print, you must set the appropriate ‘Print Command’ and ‘Paper Size’, in the print 
dialog box. These settings will be remembered by Maple for the next time you use it. 

Print Command In the Print dialog box, you should select ‘Print Command’ and replace 
whatever is there with lpr -Phold to send your output to one of the other classrooms’ laser 
printers. 

Paper Size In the ‘paper size’ section you should select ‘A4’. 

3.2 Using Maple to Check Results 

As our first example, we consider again the minimisation problem given in Example 2.17. 
We show what is almost a transcription of the session with Maple; note that Maple 
commands appear after the prompt >, while the responses, or answers, appear more or less 
in the centre of the line or lines below: 

> with(simplex): 

The package to do the manipulations required by the simplex algorithm is not available
to Maple unless you ask for it. We ask for the package with the command with(simplex);
and the response shows the new commands available. (Ending the command with a colon
instead, as in the transcript above, suppresses the response.) You can probably guess what some
of them do. To find out more, type ?simplex and a help screen will appear. To get
information about the individual commands, you can query them within the package; thus
?simplex,setup gives information about the setup command. Before going any further,
I note that the syntax is important. Any Maple command ends with a semicolon;
without this punctuation the system will wait for further input. A second syntactic point
is about assignment. The command x:=2 assigns x the value 2. In contrast the statement
x=2 is a predicate; it has the value true or false depending on the value of x before the test,
and the value of x is not affected by the test. This is the same as Pascal, but different from
the FORTRAN convention, in which the = on its own is used for assignment. To make the
point more strongly, you can check the following conversation with Maple.

> x=2;

        x = 2

> x;

        x


Another point to note concerns “unassigning” variables. If you have been manipulating
the variable x, and wish to re-run some commands, you should unassign x before doing so
with the command x := 'x'. If this is not done, you may find that x is contaminated by
a previous run.

To return to the simplex algorithm, we specify the constraints and set up the problem:

> constraints := [x1-2*x2 >= -4, 2*x1+3*x2 <= 13, x1-x2 <= 4];

        constraints := [-4 ≤ x1 - 2 x2, 2 x1 + 3 x2 ≤ 13, x1 - x2 ≤ 4]

We give the constraints in an arbitrary form, as a list, with the individual constraints
delimited by commas. All the documentation suggests that constraints can be specified
either as a list (with square brackets thus [a,b,c]) or as a set thus {a,b,c}. In practice
there seems to be a bug in this version of the code, Maple Version 5.3, which means that
sets are not always handled properly. The difference between the two types is that in a list,
the elements are ordered, while in a set, they are not; it seems that some code was relying
on the fact that a set of ratios built from a set of constraints was ordered in the same way as
the constraint set. You are thus advised only to use lists for constraints until the problem
is solved. For reference, given lists L1 = [a,b] and L2 = [c,d], the list L = [a,b,c,d]
is made by

L1 := [a,b]; L2 := [c,d]; L := [op(L1), op(L2)];

More generally, note that the comma separator needs to be used when creating or
manipulating both sets and lists.

If you have a set, say a := {x, y, z}, then the corresponding list is made by

convert(a,list);

which produces [x, y, z]. The op used above selects operands; all of them when we used it
to get the elements of the lists L1 and L2; more generally, op(2, L) is just b.

Given now a list of constraints, it can then be converted to the standard ≤ form for
subsequent processing.

> c:=convert(constraints,stdle);

        c := [-x1 + 2 x2 ≤ 4, 2 x1 + 3 x2 ≤ 13, x1 - x2 ≤ 4]

> c:=setup(c,NONNEGATIVE);

        c := [_SL1 = 4 + x1 - 2 x2, _SL2 = 13 - 2 x1 - 3 x2, _SL3 = 4 - x1 + x2]

The problem is now set up, and Maple automatically introduces slack variables, which
it calls _SL1, _SL2, _SL3, to convert the inequalities to equalities.

> basis(c);

        [_SL1, _SL2, _SL3]

> obj:=-4*x1+3*x2;

        obj := -4 x1 + 3 x2




The basis of the set of constraints is another way of describing the set of basic columns
in the first tableau of Table 2.5. Our aim is to minimise the objective function which we
call obj. Note that, because we are dealing with the 1-phase problem, we have a basic
feasible solution already; the rather boring one $x_1 = x_2 = 0$.

> pivotvar(-obj);

        x1

> ratio(c,x1);

        [∞, 13/2, 4]

Now we set about improving the objective function. The command pivotvar only has
the concept of a maximising problem, so we try to maximise -obj. In any case the idea is
clear; to decrease the objective function, while keeping $x_1$ and $x_2$ non-negative, we clearly
start by working on $x_1$, rather than $x_2$. The ratio command looks in the last column of
the tableau to see which equation to swap out of the basis. Recalling Theorem 2.16, we
want the smallest positive ratio, provided it is non-zero. Maple chooses to report negative
ratios as ∞ so there can be no confusion, and we see that we should swap out the third
equation, corresponding to a ratio of 4.

> e:=pivoteqn(c,x1);

        e := [_SL3 = 4 - x1 + x2]

> c:=pivot(c,x1,e);

        c := [_SL1 = 8 - _SL3 - x2, _SL2 = 5 + 2 _SL3 - 5 x2, x1 = -_SL3 + 4 + x2]

> basis(c);

        [_SL1, _SL2, x1]

We have now set e to the equation to be swapped out (the command pivoteqn has
used the command ratio itself), and done the pivot, to put $x_1$ in the basis; we see that
_SL3 has come out of the basis. The constraints are then re-arranged so that the basis
variables appear on the left hand side of the constraint equations; this is the new value of
the constraint set c. Initially, we could set $x_1$ and $x_2$ as we wished, and the slack variables
were determined by these choices. Now our “free” variables are $x_2$ and _SL3, and the others
are then determined by the constraints.

> obj:=subs(c,obj);

obj := 4 _SL3 - 16 - x2

Before going on, we express the objective in terms of the new “free” variables, and 
then repeat the process. The command pivotvar reports Fail if there is no proper sign for 
improvement. In our case, we can continue as follows: 

> pivotvar(-obj);

x2

> ratio(c,x2);

[8, 1, infinity]

> e:=pivoteqn(c,x2);

e := [_SL2 = 5 + 2 _SL3 - 5 x2]

> c:=pivot(c,x2,e);

c := [_SL1 = 7 - 7/5 _SL3 + 1/5 _SL2, x2 = -1/5 _SL2 + 1 + 2/5 _SL3, x1 = -3/5 _SL3 + 5 - 1/5 _SL2]

> basis(c);

[_SL1, x2, x1]

> obj:=subs(c,obj);

obj := 18/5 _SL3 - 17 + 1/5 _SL2

In fact the algorithm has now terminated; the objective function has no proper sign for 
improvement. To find the “answers”, we substitute zero for each of the free (or non-basis) 
variables: 

> free:=[_SL2=0,_SL3=0];

free := [_SL2 = 0, _SL3 = 0]

> subs(free,obj);

-17

> subs(free,c);

[_SL1 = 7, x2 = 1, x1 = 5]

And of course we get the same solution as before. 

3.1. Example. Give a Maple solution for Example 2.20. 

Solution We give the dialogue with Maple without further comment. 

> with(simplex):

> c:=[3*x1 + 4*x2 <=12, 7*x1+2*x2 <=14];

c := [3 x1 + 4 x2 <= 12, 7 x1 + 2 x2 <= 14]

> c:=setup(c,NONNEGATIVE);

c := [_SL1 = 12 - 3 x1 - 4 x2, _SL2 = 14 - 7 x1 - 2 x2]

> obj:=4*x1+3*x2;

obj := 4 x1 + 3 x2

> pivotvar(obj);

x1

> ratio(c,x1);

[4, 2]

> c:=pivot(c,x1,pivoteqn(c,x1));

c := [_SL1 = 6 + 3/7 _SL2 - 22/7 x2, x1 = -1/7 _SL2 + 2 - 2/7 x2]

> obj:=subs(c,obj);

obj := -4/7 _SL2 + 8 + 13/7 x2

> pivotvar(obj);

x2

> ratio(c,x2);

[21/11, 7]

> c:=pivot(c,x2,pivoteqn(c,x2));

c := [x2 = -7/22 _SL1 + 21/11 + 3/22 _SL2, x1 = -2/11 _SL2 + 16/11 + 1/11 _SL1]

> obj:=subs(c,obj);

obj := -7/22 _SL2 + 127/11 - 13/22 _SL1

> free:=[_SL1=0,_SL2=0];

free := [_SL1 = 0, _SL2 = 0]

> subs(free,obj);

127/11

> subs(free,c);

[x2 = 21/11, x1 = 16/11]



It is not necessary to work through the example by hand as we have just done: if you 
are just interested in the answer, then the built in commands of maximize and minimize 
give the answer almost immediately. Here is a minimal version of the first example: 

> with(simplex):

> c:=[x1-2*x2 >= -4, 2*x1 + 3*x2 <=13, x1-x2 <=4];

c := [-4 <= x1 - 2 x2, 2 x1 + 3 x2 <= 13, x1 - x2 <= 4]

> c:=setup(c);

c := [_SL1 = 4 + x1 - 2 x2, _SL2 = 13 - 2 x1 - 3 x2, _SL3 = 4 - x1 + x2]

> obj:=-4*x1+3*x2;

obj := -4 x1 + 3 x2

> vars:=minimize(obj,c,NONNEGATIVE);

vars := {_SL3 = 0, _SL2 = 0, _SL1 = 7, x2 = 1, x1 = 5}

> subs(vars,obj);

-17

It is perhaps appropriate to note two points here. Even though we work with lists, 
minimize returns a set. And minimize is written exactly as given here; the function 
minimise is undefined! 
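
For a problem this small, the answer can also be cross-checked without any simplex machinery. The Python sketch below is a brute-force check, not what Maple's minimize does: it intersects every pair of constraint boundaries, keeps the feasible intersection points, and evaluates the objective at each. It relies on the feasible region here being non-empty and bounded; the names `cons`, `vertices` and `objective` are chosen for this illustration only.

```python
from fractions import Fraction
from itertools import combinations

# Each constraint is a*x1 + b*x2 <= c; the last two encode x1 >= 0 and x2 >= 0.
cons = [(-1, 2, 4), (2, 3, 13), (1, -1, 4), (-1, 0, 0), (0, -1, 0)]

def objective(x1, x2):
    return -4 * x1 + 3 * x2

def vertices(cons):
    """Feasible intersection points of pairs of constraint boundaries."""
    pts = set()
    for (a1, b1, c1), (a2, b2, c2) in combinations(cons, 2):
        det = a1 * b2 - a2 * b1
        if det == 0:
            continue  # parallel boundaries never meet
        # Cramer's rule for the 2 x 2 system
        x1 = Fraction(c1 * b2 - c2 * b1, det)
        x2 = Fraction(a1 * c2 - a2 * c1, det)
        if all(a * x1 + b * x2 <= c for a, b, c in cons):
            pts.add((x1, x2))
    return pts

best = min(vertices(cons), key=lambda p: objective(*p))
print(best, objective(*best))
```

The minimum is found at the vertex $x_1 = 5$, $x_2 = 1$ with value $-17$, agreeing with the Maple output.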


3.3 Pivoting Using Maple 

An alternative way of using Maple is simply to let it do the pivoting you would otherwise 
do by hand. Here is what happens if we do the first example using the tableau method 
we have described above, making the same decisions about which rows to swap in and 
out, and simply using Maple to do the pivoting. As you will see, the main problem is in 
constructing the tableau in the first place! 

> with(linalg): 

We don’t need the simplex package; the more general purpose, but much larger, linear
algebra package, declared as shown, has what we need; in fact it overrides some of the
definitions in the simplex package. The linear algebra package contains all the routines
used here. Recall that you can get help on any routine with a command such as ?linalg,pivot. You
can probably guess enough to ask sensible questions of the help system; you may find you
can get enough information in this way to be able to use Maple for work in other subjects.
We first set up the tableau.


> A:=concat(matrix(3,2,[-1,2,2,3,1,-1]),diag(1,1,1));

     [ -1   2   1   0   0 ]
A := [  2   3   0   1   0 ]
     [  1  -1   0   0   1 ]

> B:=concat(A,concat(vector([0,0,0]),vector([4,13,4])));

     [ -1   2   1   0   0   0    4 ]
B := [  2   3   0   1   0   0   13 ]
     [  1  -1   0   0   1   0    4 ]

> obj:=vector([4,-3,0,0,0,1,0]);

obj := [4, -3, 0, 0, 0, 1, 0]

> T1:=stackmatrix(B,obj);

      [ -1   2   1   0   0   0    4 ]
T1 := [  2   3   0   1   0   0   13 ]
      [  1  -1   0   0   1   0    4 ]
      [  4  -3   0   0   0   1    0 ]

We have now made the first of the tableaus in Table 2.5. The calculation continues 
exactly as there; all we get Maple to do is perform the pivoting operations. 


> T2:=pivot(T1,3,1);

      [ 0   1   1   0    1   0     8 ]
T2 := [ 0   5   0   1   -2   0     5 ]
      [ 1  -1   0   0    1   0     4 ]
      [ 0   1   0   0   -4   1   -16 ]

> T3:=pivot(T2,2,2);

      [ 0   0   1   -1/5     7/5   0     7 ]
T3 := [ 0   5   0     1      -2    0     5 ]
      [ 1   0   0    1/5     3/5   0     5 ]
      [ 0   0   0   -1/5   -18/5   1   -17 ]

And of course we end up with the same tableau, except that Maple chooses not to
normalise a row before using it to pivot.
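
The two pivots are easy to reproduce in exact arithmetic. The Python sketch below imitates the behaviour seen in the output above (the pivot row is left unscaled and the rest of the pivot column is cleared); it is a stand-in written for this illustration, not the linalg routine itself, and it uses 1-based row and column indices to match the Maple calls.

```python
from fractions import Fraction

def pivot(T, i, j):
    """Clear column j in every row except row i (1-based indices).

    The pivot row itself is not normalised, matching the output above.
    """
    p = T[i - 1]
    if p[j - 1] == 0:
        raise ValueError("pivot entry is zero")
    out = []
    for k, row in enumerate(T):
        if k == i - 1:
            out.append(list(row))
        else:
            m = row[j - 1] / p[j - 1]
            out.append([a - m * b for a, b in zip(row, p)])
    return out

T1 = [[Fraction(v) for v in row] for row in [
    [-1,  2, 1, 0, 0, 0,  4],
    [ 2,  3, 0, 1, 0, 0, 13],
    [ 1, -1, 0, 0, 1, 0,  4],
    [ 4, -3, 0, 0, 0, 1,  0],
]]

T2 = pivot(T1, 3, 1)
T3 = pivot(T2, 2, 2)
for row in T3:
    print([str(v) for v in row])
```

Dividing the second row of T3 by 5 gives the normalised tableau of Table 2.5, with $x_2 = 1$, $x_1 = 5$ and objective value $-17$.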


3.4 Cycling in Example 2.26 

In section 2.8 we asserted that the algorithm could cycle, given a particular choice of basis 
sequence. We now verify this using Maple. 

> with(linalg):

> M:=array([[1/4,-8,-1,9],[1/2,-12,-1/2,3],[0,0,1,0],[-1/4,20,-1/2,6]]);

     [  1/4    -8    -1     9 ]
M := [  1/2   -12   -1/2    3 ]
     [   0      0     1     0 ]
     [ -1/4    20   -1/2    6 ]

> MI:=concat(M,diag(1,1,1,1));

      [  1/4    -8    -1     9   1   0   0   0 ]
MI := [  1/2   -12   -1/2    3   0   1   0   0 ]
      [   0      0     1     0   0   0   1   0 ]
      [ -1/4    20   -1/2    6   0   0   0   1 ]

> b:=vector([0,0,1,0]);

b := [0, 0, 1, 0]

> A:=concat(MI,b);B:=copy(A):

     [  1/4    -8    -1     9   1   0   0   0   0 ]
A := [  1/2   -12   -1/2    3   0   1   0   0   0 ]
     [   0      0     1     0   0   0   1   0   1 ]
     [ -1/4    20   -1/2    6   0   0   0   1   0 ]


You might have expected to store a copy of the matrix A in B by using the command
B:=A;, but this would not have done what you wished. It would have made B into a pointer
which always gave you the current value of A, even when it had altered. Since I am trying
to show that after a number of changes A comes back to where it started, I need to copy
the original value of A into B.


> A:=pivot(A,1,1);

     [ 1/4   -8    -1      9    1   0   0   0   0 ]
A := [  0     4    3/2   -15   -2   1   0   0   0 ]
     [  0     0     1      0    0   0   1   0   1 ]
     [  0    12   -3/2    15    1   0   0   1   0 ]

> A:=pivot(A,2,2);

     [ 1/4    0     2    -21   -3    2   0   0   0 ]
A := [  0     4    3/2   -15   -2    1   0   0   0 ]
     [  0     0     1      0    0    0   1   0   1 ]
     [  0     0    -6     60    7   -3   0   1   0 ]

> A:=pivot(A,1,3);

     [  1/4    0   2   -21     -3      2    0   0   0 ]
A := [ -3/16   4   0   3/4    1/4   -1/2    0   0   0 ]
     [ -1/8    0   0   21/2   3/2    -1     1   0   1 ]
     [  3/4    0   0   -3     -2      3     0   1   0 ]

> A:=pivot(A,2,4);

     [  -5     112   2    0     4    -12    0   0   0 ]
A := [ -3/16     4   0   3/4   1/4   -1/2   0   0   0 ]
     [  5/2    -56   0    0    -2      6    1   0   1 ]
     [   0      16   0    0    -1      1    0   1   0 ]

> A:=pivot(A,1,5);

     [  -5    112     2     0    4   -12    0   0   0 ]
A := [  1/8    -3   -1/8   3/4   0    1/4   0   0   0 ]
     [   0      0     1     0    0     0    1   0   1 ]
     [ -5/4    44    1/2    0    0    -2    0   1   0 ]

> A:=pivot(A,2,6);

     [   1    -32    -4     36   4    0    0   0   0 ]
A := [  1/8    -3   -1/8   3/4   0   1/4   0   0   0 ]
     [   0      0     1     0    0    0    1   0   1 ]
     [ -1/4    20   -1/2    6    0    0    0   1   0 ]

> A:=mulrow(A,1,1/4):A:=mulrow(A,2,4);

     [  1/4    -8    -1     9   1   0   0   0   0 ]
A := [  1/2   -12   -1/2    3   0   1   0   0   0 ]
     [   0      0     1     0   0   0   1   0   1 ]
     [ -1/4    20   -1/2    6   0   0   0   1   0 ]

> matadd(A,B,1,-1);

[ 0   0   0   0   0   0   0   0   0 ]
[ 0   0   0   0   0   0   0   0   0 ]
[ 0   0   0   0   0   0   0   0   0 ]
[ 0   0   0   0   0   0   0   0   0 ]


Thus we have completed a cycle. Note that A and B were assigned the same value, so
this is a check that we have returned to where we started. If instead I had done B:=A, it
would have shown nothing, because the value of B would have been the “current” value
of A.

Cycling occurred when we made a particular choice of pivot. Starting in a different way
leads to convergence.
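
The whole cycle can be checked mechanically. The Python sketch below re-implements the un-normalised pivot and the row scaling as plain functions (stand-ins for the linalg routines, written for this illustration), applies the six pivots used above, rescales the first two rows, and compares the result with the starting tableau. The matrix entries are taken as read from the session above; treat them as an assumption if your copy of Example 2.26 differs.

```python
from fractions import Fraction as F

def pivot(T, i, j):
    """Un-normalised pivot on entry (i, j), 1-based, returning a new tableau."""
    p = T[i - 1]
    return [list(row) if k == i - 1
            else [a - (row[j - 1] / p[j - 1]) * b for a, b in zip(row, p)]
            for k, row in enumerate(T)]

def mulrow(T, i, c):
    """Multiply row i (1-based) by the constant c."""
    return [[c * a for a in row] if k == i - 1 else list(row)
            for k, row in enumerate(T)]

start = [
    [F(1, 4),  F(-8),  F(-1),    F(9), F(1), F(0), F(0), F(0), F(0)],
    [F(1, 2),  F(-12), F(-1, 2), F(3), F(0), F(1), F(0), F(0), F(0)],
    [F(0),     F(0),   F(1),     F(0), F(0), F(0), F(1), F(0), F(1)],
    [F(-1, 4), F(20),  F(-1, 2), F(6), F(0), F(0), F(0), F(1), F(0)],
]

A = [row[:] for row in start]          # a genuine copy, like copy(A) in Maple
for i, j in [(1, 1), (2, 2), (1, 3), (2, 4), (1, 5), (2, 6)]:
    A = pivot(A, i, j)
A = mulrow(mulrow(A, 1, F(1, 4)), 2, F(4))
cycled = (A == start)

# The alternative start: pivot about (2,1), then about (3,3)
alt = pivot(pivot(start, 2, 1), 3, 3)
print(cycled, alt[3][-1])
```

The first check confirms the return to the starting tableau; the second gives $3/4$ in the bottom right-hand corner after the alternative pair of pivots.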


> A:=pivot(A,2,1);

     [  0     -2   -3/4   15/2   1   -1/2   0   0   0 ]
A := [ 1/2   -12   -1/2     3    0     1    0   0   0 ]
     [  0      0     1      0    0     0    1   0   1 ]
     [  0     14   -3/4   15/2   0    1/2   0   1   0 ]

> A:=pivot(A,3,3);

     [  0     -2   0   15/2   1   -1/2   3/4   0   3/4 ]
A := [ 1/2   -12   0     3    0     1    1/2   0   1/2 ]
     [  0      0   1     0    0     0     1    0    1  ]
     [  0     14   0   15/2   0    1/2   3/4   1   3/4 ]

> A:=mulrow(A,2,2);

     [ 0    -2   0   15/2   1   -1/2   3/4   0   3/4 ]
A := [ 1   -24   0     6    0     2     1    0    1  ]
     [ 0     0   1     0    0     0     1    0    1  ]
     [ 0    14   0   15/2   0    1/2   3/4   1   3/4 ]

Having pivoted about (3,3), we see there was no proper sign for improvement; thus we
can read off the maximum value: 3/4 as claimed.


3.5 An Extended Exercise 

You are now in a position to tackle completely the request from the board of MMM described 
originally in 1.3. You are reminded that this is to take the form of the report that you, 
as their consultant, deliver to the board, recommending their course of action with respect 
to the three new alloys. I’m sure that writing such a report will impose significantly on 
your already busy schedule; however this type of tight timing also occurs in practice, so 
this is not an inappropriate simulation. Since the point of this exercise is mainly one of
presentation and the use of Maple, rather than in solving the problem, I’ve made available
one Maple “solution” to the problem on the web.


Chapter 4 


The Two-Phase Simplex Algorithm 


4.1 Introduction 

In the one-phase simplex algorithm it is assumed that an initial basic feasible solution for
the related standard form linear programming problem is either given or obvious.

Suppose, for instance, that the original problem is: maximise $x_{n+1}$ subject to

$Ax \le b$ and $x_1, x_2, \ldots, x_n \ge 0$

where $b \ge 0$. This is the particular case when the constraints can all be written as “less than
or equals” and the right-hand sides turn out to be greater than or equal to 0. The related
standard form is obtained by adding $m$ slack variables, and $x = (0, 0, \ldots, 0, b_1, \ldots, b_m; 0)$ is
obviously an initial basic feasible solution. In general it is necessary to have an algorithm,
called the second phase, to find a basic feasible solution, after which the “first phase”
already discussed is used to complete the solution of the problem.


              a_1  a_2  ...  a_s  ...  a_n     e      b
a_{h_1}                   y_{1s}               0     u_1        (>= 0)
  ...
a_{h_i}                   y_{is}               0     u_i        (>= 0)
  ...
a_{h_q}                   y_{qs}               0     u_q        (>= 0)
a_{h_{q+1}}               y_{q+1,s}            0     u_{q+1}    (< 0)
  ...
a_{h_m}                   y_{ms}               0     u_m        (< 0)
e             v_1  v_2  ...  v_s  ...  v_n     1      z

Table 4.1: A two phase tableau.


4.1. Definition. A two-phase tableau for $Ax = b$ is a tableau with $e$ as the last member
of the basis and with the rows of the tableau arranged so that all the negative $u_i$, if any,
are below all the non-negative $u_i$ in the last column.

Table 4.1 shows a two-phase tableau. The rows 1 to $q$, in which $u_i \ge 0$, are called good
rows and the rows $q + 1$ to $m$, in which $u_i < 0$, are bad rows. The associated basic solution
is a basic feasible solution if and only if $q = m$.


4.2 The Second Phase Described 

As with the one-phase we use a flow chart to summarise the behaviour. We assume in 
Fig. 4.1 that we have a genuine two-phase problem. The chart then shows how to reduce 
the problem to the one phase situation, at which point we can use the flowchart of Fig. 2.1. 


[Flow chart; its boxes and branches are listed here in order.]

START

1. Is there an $s$ such that $y_{q+1,s} < 0$?
   No:  Box A. There is no feasible solution. Stop.
   Yes: Choose such an $s$ and go to step 2.

2. Is there an $r < q + 1$ with $y_{rs} > 0$?
   Yes: Choose $r$ to minimise $u_r / y_{rs}$ subject to $y_{rs} > 0$.
        Pivot at $y_{rs}$ to form a new tableau. Go to step 3.
   No:  Box B. Pivot at $y_{q+1,s}$ to form a new tableau;
        $u_{q+1}$ becomes non-negative and each good row stays good. Go to step 3.

3. Do we have $u_{q+1} \ge 0$?
   No:  Return to step 1.
   Yes: Go to step 4.

4. Are there any more bad rows, i.e. is $q < m$?
   Yes: Let $q = q + 1$, so treat the next bad row. Return to step 1.
   No:  System is now in reduced normal form. Use the one-phase algorithm.

Figure 4.1: Flow chart for two-phase simplex algorithm.

If $q = m$, proceed immediately to the first phase. If not, treat the first bad row as the
objective row in a maximising problem with the aim of increasing $u_{q+1}$ until it becomes
greater than or equal to 0. Thus, find a $y_{q+1,s} < 0$, if possible; if it is not possible, we shall
see that there cannot be any feasible solutions. Then, if possible, choose from the good
rows a suitable $r$ (as in the one-phase method) and pivot about $y_{rs}$, thereby increasing the
value of $u_{q+1}$. If there is no such $r$ then pivot about $y_{q+1,s}$ itself. In this way, the aim is to
change each bad row in turn into a good row. Fig. 4.1 shows a flow-chart that gives more
details.
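
The procedure just described can be sketched as a short program. The Python below is an illustrative reading of the flow chart, not code from the course: the function names are invented here, rows are not physically re-ordered (a row counts as good when its last entry is non-negative), the first eligible column $s$ is always chosen, and no precaution is taken against cycling. The tableau is a list of constraint rows followed by the objective row, with $e$ and $b$ as the last two columns.

```python
from fractions import Fraction

def pivot(T, r, s):
    """Normalised pivot on entry (r, s), 0-based, returning a new tableau."""
    p = [a / T[r][s] for a in T[r]]
    return [p if k == r else [a - row[s] * b for a, b in zip(row, p)]
            for k, row in enumerate(T)]

def second_phase(T):
    """Turn bad rows into good rows; return (tableau, feasible?)."""
    T = [[Fraction(a) for a in row] for row in T]
    m = len(T) - 1            # number of constraint rows
    n = len(T[0]) - 2         # number of columns a_1 .. a_n
    while True:
        bad = [i for i in range(m) if T[i][-1] < 0]
        if not bad:
            return T, True    # ready for the one-phase algorithm
        b = bad[0]            # treat the first bad row
        cols = [s for s in range(n) if T[b][s] < 0]
        if not cols:
            return T, False   # Box A: no feasible solution
        s = cols[0]
        good = [r for r in range(m) if T[r][-1] >= 0 and T[r][s] > 0]
        if good:
            r = min(good, key=lambda r: T[r][-1] / T[r][s])
            T = pivot(T, r, s)
        else:
            T = pivot(T, b, s)  # Box B: pivot in the bad row itself

# A tableau with one bad row (the third), and two variants of its first row
base = [[ 1,  4, 1, 0, 0, 0, 15],
        [-2, -5, 0, 1, 0, 0,  4],
        [-2,  3, 0, 0, 1, 0, -6],
        [-5,  3, 0, 0, 0, 1, 10]]
Ta, ok_a = second_phase(base)
Tb, ok_b = second_phase([[0, 4, 1, 0, 0, 0, 15]] + base[1:])
Tc, ok_c = second_phase([[1, 4, 1, 0, 0, 0, 2]] + base[1:])
print(ok_a, Ta[-1][-1], ok_b, Tb[-1][-1], ok_c)
```

The first call removes the bad row with a single pivot in a good row; the second has no suitable good row and so pivots in the bad row (Box B); the third ends at Box A.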









































We will first consider an example in which all the constraints are inequalities. The
general procedure is to make all the constraints into “less than or equals” by multiplying
by $-1$ if necessary. Then the constraints are rearranged so that those in which the right
hand side is negative come below the others. As in the one-phase algorithm, introduce slack
variables and add the objective row.

4.2. Example. Maximise $5x_1 - 3x_2 + 10$ subject to

$2x_1 - 3x_2 \ge 6,$

$x_1 + 4x_2 \le 15,$

$2x_1 + 5x_2 \ge -4,$

$x_1, x_2 \ge 0.$

Solution A related standard form for this problem is:

Maximise $x_6$ subject to

$x_1 + 4x_2 + x_3 = 15,$

$-2x_1 - 5x_2 + x_4 = 4,$

$-2x_1 + 3x_2 + x_5 = -6,$

$-5x_1 + 3x_2 + x_6 = 10,$

$x_1, \ldots, x_5 \ge 0.$

Table 4.2 shows the tableaus obtained by following the two-phase algorithm. The initial
tableau $T_1$ of course has an associated basic solution $(0, 0, 15, 4, -6; 10)$, but notice that
this is not feasible. Pivoting about $y_{11} (= 1)$ gives tableau $T_2$ with basic feasible solution
$(15, 0, 0, 34, 24; 85)$ which is seen to be optimal. So an optimal solution of the original
problem is $x_1 = 15$, $x_2 = 0$ with optimal value 85.



 T_1     a_1   a_2   a_3   a_4   a_5    e     b
 a_3      1*    4     1     0     0     0    15
 a_4     -2    -5     0     1     0     0     4
 a_5     -2     3     0     0     1     0    -6
 e       -5     3     0     0     0     1    10

 T_2     a_1   a_2   a_3   a_4   a_5    e     b
 a_1      1     4     1     0     0     0    15
 a_4      0     3     2     1     0     0    34
 a_5      0    11     2     0     1     0    24
 e        0    23     5     0     0     1    85

Table 4.2: Two tableaus for Example 4.2.

4.3. Example. Consider the same problem as Example 4.2 except that $y_{11} = 0$ instead of 1.

Solution In these circumstances, one uses Box B and pivots about an entry in the bad row
itself: pivot about $y_{31} (= -2)$. Table 4.3 shows the result. Tableau $T_2$ has basic feasible
solution $(3, 0, 15, 10, 0; 25)$ and this time one sees that there is no optimal solution. This
is because $v_5 (= -5/2)$ has the proper sign for improvement, but none of the entries in that
column is positive.

Note also that $v_2 (= -9/2)$ also has the proper sign for improvement, and we could indeed
pivot about the corresponding entry 4 in row 1; but our Theorem 2.23 tells us that there is
no optimal solution anyway; a further pivot merely postpones the realization of this fact.

Finally note that in the tableau $T_1$, column 1 has all its entries $\le 0$. We can’t apply our
results of Theorem 2.23 directly, since we don’t yet have a feasible solution, and the result
was only proved assuming we started from a tableau with a basic feasible solution.



 T_1     a_1    a_2    a_3   a_4    a_5     e     b
 a_3      0      4      1     0      0      0    15
 a_4     -2     -5      0     1      0      0     4
 a_5     -2*     3      0     0      1      0    -6
 e       -5      3      0     0      0      1    10

 T_2     a_1    a_2    a_3   a_4    a_5     e     b
 a_3      0      4      1     0      0      0    15
 a_4      0     -8      0     1     -1      0    10
 a_1      1    -3/2     0     0    -1/2     0     3
 e        0    -9/2     0     0    -5/2     1    25

Table 4.3: Example 4.3: pivoting about a bad row.


4.4. Example. Consider Example 4.2 again, but with a different change: with $b_1 = 2$ instead
of 15.

Solution Pivoting about $y_{11} (= 1)$ in tableau $T_1$ of Table 4.4 gives tableau $T_2$ which still has
a bad row. Since now none of the entries in this bad row is negative, the value $u_{q+1} (= -2)$
cannot be increased. We therefore reach Box A and the problem has no feasible solutions.



 T_1     a_1   a_2   a_3   a_4   a_5    e     b
 a_3      1*    4     1     0     0     0     2
 a_4     -2    -5     0     1     0     0     4
 a_5     -2     3     0     0     1     0    -6
 e       -5     3     0     0     0     1    10

 T_2     a_1   a_2   a_3   a_4   a_5    e     b
 a_1      1     4     1     0     0     0     2
 a_4      0     3     2     1     0     0     8
 a_5      0    11     2     0     1     0    -2
 e        0    23     5     0     0     1    20

Table 4.4: Example 4.4 with no feasible solutions.




















4.3 Artificial Variables 

When any of the constraints is an equation, there are two possible approaches. In one we
replace the equation by a pair of inequalities as discussed in Section 1.1. An alternative way,
sometimes significantly easier to manage by hand, is to introduce a corresponding artificial
variable.

In the following example, for the equation $x_1 + 4x_2 = 15$, an artificial variable $x_3$ is
introduced to give $x_1 + 4x_2 + x_3 = 15$. Then a basic feasible solution of the new problem
corresponds to a basic feasible solution of the original problem only if $x_3 = 0$. The new
problem is strictly speaking not a related standard form of the original. The procedure is
to make a change of basis in order to replace $a_3$ and thus ensure that $x_3 = 0$. The column
$a_3$ can then be ignored and the two-phase algorithm followed.

The reason for the introduction of an artificial variable is that otherwise one doesn’t
have a genuine tableau with which to start: the matrix obtained is not in canonical form.

4.5. Example. Maximise $5x_1 - 3x_2 + 10$ subject to

$2x_1 - 3x_2 \ge 6,$

$x_1 + 4x_2 = 15,$

$2x_1 + 5x_2 \ge -4,$

$x_1, x_2 \ge 0.$

Solution We obtain the following Linear Programming Problem in standard form: maximise
$x_6$ subject to

$x_1 + 4x_2 + x_3 = 15,$

$-2x_1 - 5x_2 + x_4 = 4,$

$-2x_1 + 3x_2 + x_5 = -6,$

$-5x_1 + 3x_2 + x_6 = 10,$

$x_1, \ldots, x_5 \ge 0,$

where the variable $x_3$ is artificial. Table 4.5 shows that pivoting about $y_{11} (= 1)$ in the
first row removes $a_3$ from the basis. It happens that as a result the value of $u_{q+1} (= -6)$ is
increased and $T_2$ has no bad rows. The procedure is now to ignore column $a_3$ and continue.

In this case we have already arrived at an optimal solution $(15, 0, 0, 34, 24; 85)$ which
has $x_3 = 0$ as required. So $x_1 = 15$, $x_2 = 0$ is an optimal solution of the original problem
with optimal value 85.


4.3.1 The M method 

There is an interesting re-arrangement of this calculation known as the big M method.
In this form we deal with the artificial variable by modifying the objective function. Thus
in Example 4.5, a maximising problem in which $x_3$ is the artificial variable, we work with
the objective function

$x_6 = 5x_1 - 3x_2 + 10 - M x_3$





 T_1     a_1   a_2   a_3   a_4   a_5    e     b
 a_3      1*    4     1     0     0     0    15
 a_4     -2    -5     0     1     0     0     4
 a_5     -2     3     0     0     1     0    -6
 e       -5     3     0     0     0     1    10

 T_2     a_1   a_2   a_3   a_4   a_5    e     b
 a_1      1     4    (1)    0     0     0    15
 a_4      0     3    (2)    1     0     0    34
 a_5      0    11    (2)    0     1     0    24
 e        0    23    (5)    0     0     1    85

Table 4.5: Example 4.5: artificial variables. The bracketed column $a_3$ is ignored in $T_2$.


where $M$ is a large positive constant. There is no need to specify in advance how large
it is; instead insist that it is larger than any given competitor, so that for example
$M - 7$ or $M - 63$ are each guaranteed to be positive. Note that $x_3 \ge 0$, and that we
have constructed our auxiliary objective function in such a way that $x_6$ is reduced when
$x_3 > 0$. Since $M$ is so large, a maximum of $x_6$ is bound to occur when $x_3 = 0$, unless
the original problem does not have a feasible solution. We thus set the problem up as
follows:


> with(linalg):

> A:=matrix(4,7,[1,4,1,0,0,0,15,-2,-5,0,1,0,0,4,-2,3,0,0,1,0,-6,-5,3,M,
> 0,0,1,10]);

     [  1    4   1   0   0   0   15 ]
A := [ -2   -5   0   1   0   0    4 ]
     [ -2    3   0   0   1   0   -6 ]
     [ -5    3   M   0   0   1   10 ]

Note that this does not give a tableau, but we soon restore one by doing the obvious
pivot.

> B:=pivot(A,1,3);

     [    1         4       1   0   0   0       15     ]
B := [   -2        -5       0   1   0   0        4     ]
     [   -2         3       0   0   1   0       -6     ]
     [ -M - 5   -4 M + 3    0   0   0   1   -15 M + 10 ]

At this point we have a “normal” problem to solve; the only unusual feature is in looking
for a proper sign for improvement, when we have to remember that $M > 0$ is large when
determining the signs in the objective row. Thus in this case the objective row entries in
both columns 1 and 2 are negative; we choose to pivot in column 1.

> C:=pivot(B,1,1);

     [ 1    4     1     0   0   0   15 ]
C := [ 0    3     2     1   0   0   34 ]
     [ 0   11     2     0   1   0   24 ]
     [ 0   23   M + 5   0   0   1   85 ]


















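This is our final tableau; since $M > 0$ is large, certainly $M > -5$, so the entry $M + 5$ in the objective row is positive and there is no proper sign for improvement.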
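
If a symbolic $M$ is not available, the same two pivots can be tried with a concrete large number in its place. The Python sketch below does this in exact arithmetic; the value $10^6$ is an arbitrary stand-in chosen for this illustration, and `pivot` is a re-implementation of the un-normalised pivot (1-based indices), not the Maple routine. The bottom right-hand entry of the final tableau does not depend on the value chosen.

```python
from fractions import Fraction

def pivot(T, i, j):
    """Un-normalised pivot on entry (i, j), 1-based, returning a new tableau."""
    p = T[i - 1]
    return [list(row) if k == i - 1
            else [a - (row[j - 1] / p[j - 1]) * b for a, b in zip(row, p)]
            for k, row in enumerate(T)]

M = Fraction(10**6)   # stand-in for "a large positive constant" (an assumption)

A = [[Fraction(v) for v in row] for row in [
    [ 1,  4, 1, 0, 0, 0, 15],
    [-2, -5, 0, 1, 0, 0,  4],
    [-2,  3, 0, 0, 1, 0, -6],
    [-5,  3, M, 0, 0, 1, 10],
]]

B = pivot(A, 1, 3)   # restore a tableau: clear M from the column of a_3
C = pivot(B, 1, 1)   # the pivot chosen in column 1
print(B[3])
print(C[3])
```

The objective row of C comes out as $(0, 23, M + 5, 0, 0, 1, 85)$ for this value of $M$, with all entries in the variable columns non-negative.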

4.3.2 Another Example 

4.6. Example. Minimise $2x_1 - 3x_2 + x_3$ subject to

$3x_1 - 2x_2 + x_3 \le 5,$

$x_1 + 3x_2 - 4x_3 \le 9,$

$x_2 + 5x_3 \ge 1,$

$x_1 + x_2 + x_3 = 6,$

$x_1, x_2, x_3 \ge 0.$

Solution We obtain the linear programming problem: minimise $x_8$ subject to

$3x_1 - 2x_2 + x_3 + x_4 = 5,$

$x_1 + 3x_2 - 4x_3 + x_5 = 9,$

$x_1 + x_2 + x_3 + x_6 = 6,$

$-x_2 - 5x_3 + x_7 = -1,$

$-2x_1 + 3x_2 - x_3 + x_8 = 0,$

$x_1, \ldots, x_7 \ge 0,$

where $x_6$ is an artificial variable. In tableau $T_1$ of Table 4.6, pivoting about $y_{33} (= 1)$
removes $a_6$ from the basis. The rows of tableau $T_2$ are then rearranged to give tableau
$T_3$ so that the bad row is below the others, and column $a_6$ is ignored from here on.
Pivoting in $T_3$ about $y_{12} (= 7)$ gives tableau $T_4$ which has the basic feasible solution
$(0, 33/7, 9/7, 92/7, 0, 0, 71/7; -90/7)$. This has $x_6 = 0$ and is an optimal solution, so
$x_1 = 0$, $x_2 = 33/7$, $x_3 = 9/7$ is an optimal solution of the original problem with optimal
value $-90/7$.

 T_1     a_1     a_2    a_3   a_4    a_5    a_6   a_7    e      b
 a_4      3      -2      1     1      0      0     0     0      5
 a_5      1       3     -4     0      1      0     0     0      9
 a_6      1       1      1*    0      0      1     0     0      6
 a_7      0      -1     -5     0      0      0     1     0     -1
 e       -2       3     -1     0      0      0     0     1      0

 T_2     a_1     a_2    a_3   a_4    a_5    a_6   a_7    e      b
 a_4      2      -3      0     1      0     -1     0     0     -1
 a_5      5       7      0     0      1      4     0     0     33
 a_3      1       1      1     0      0      1     0     0      6
 a_7      5       4      0     0      0      5     1     0     29
 e       -1       4      0     0      0      1     0     1      6

 T_3     a_1     a_2    a_3   a_4    a_5    a_6   a_7    e      b
 a_5      5       7*     0     0      1            0     0     33
 a_3      1       1      1     0      0            0     0      6
 a_7      5       4      0     0      0            1     0     29
 a_4      2      -3      0     1      0            0     0     -1
 e       -1       4      0     0      0            0     1      6

 T_4     a_1     a_2    a_3   a_4    a_5    a_6   a_7    e      b
 a_2     5/7      1      0     0     1/7           0     0    33/7
 a_3     2/7      0      1     0    -1/7           0     0     9/7
 a_7    15/7      0      0     0    -4/7           1     0    71/7
 a_4    29/7      0      0     1     3/7           0     0    92/7
 e     -27/7      0      0     0    -4/7           0     1   -90/7

Table 4.6: Four tableaus for Example 4.6.


4.3.3 Minimising with the Big M Method 

We can use the big M method on this example. Since we are seeking a minimum this time,
we have as objective function

$x_8 = 2x_1 - 3x_2 + x_3 + M x_6.$

Again we will only obtain a feasible solution of the original problem if in the new problem,
$x_6 = 0$. The calculations are straightforward but instructive.


> A:=matrix(5,9,[3,-2,1,1,0,0,0,0,5,1,3,-4,0,1,0,0,0,9,1,1,1,0,0,1,0,0,
> 6,0,-1,-5,0,0,0,1,0,-1,-2,3,-1,0,0,-M,0,1,0]);

     [  3   -2    1   0+1  0    0   0   0    5 ]
     [  1    3   -4    0   1    0   0   0    9 ]
A := [  1    1    1    0   0    1   0   0    6 ]
     [  0   -1   -5    0   0    0   1   0   -1 ]
     [ -2    3   -1    0   0   -M   0   1    0 ]

> B:=pivot(A,3,6);

     [   3       -2       1     1   0   0   0   0    5  ]
     [   1        3      -4     0   1   0   0   0    9  ]
B := [   1        1       1     0   0   1   0   0    6  ]
     [   0       -1      -5     0   0   0   1   0   -1  ]
     [ M - 2    M + 3   M - 1   0   0   0   0   1   6 M ]

> C:=pivot(B,2,2);

     [    11/3      0      -5/3      1      2/3       0   0   0     11    ]
     [      1       3       -4       0       1        0   0   0      9    ]
C := [     2/3      0       7/3      0     -1/3       1   0   0      3    ]
     [     1/3      0     -19/3      0      1/3       0   1   0      2    ]
     [ 2/3 M - 3    0   7/3 M + 3    0   -1/3 M - 1   0   0   1   3 M - 9 ]

> E:=pivot(C,3,3);

     [  29/7   0    0    1    3/7      5/7      0   0    92/7 ]
     [  15/7   3    0    0    3/7     12/7      0   0    99/7 ]
E := [   2/3   0   7/3   0   -1/3       1       0   0     3   ]
     [  15/7   0    0    0   -4/7     19/7      1   0    71/7 ]
     [ -27/7   0    0    0   -4/7   -M - 9/7    0   1   -90/7 ]



4.4 A Partial Proof 

As in the case of the one-phase algorithm, it is necessary to prove certain claims made in 
the description of the process. We shall not cope with the problem of cycling, but, referring 
to Fig. 4.1, will prove that 

1. the claim of Box A is true; and 

2. the claim of Box B is true. 

This will show that, apart from cycling, the treatment of each bad row either comes to a 
stop at Box A or the bad row is turned into a good row. So, after a finite number of steps, 
either the process stops or it enters the one-phase algorithm when all the bad rows have 
been eliminated. Apart from the difficulties of cycling, this establishes that the process is 
a genuine algorithm. 

4.7. Theorem. In a two-phase algorithm, if $u_{q+1} < 0$ and $y_{q+1,s} \ge 0$ for all $s$ then there
is no feasible solution.

Proof. Our methodology carries with it a proof of this result. We are trying at this stage
to increase $u_{q+1}$ to make it non-negative, and have concluded that the one-phase process has
converged, and that the objective function of this problem, namely $u_{q+1}$, is as large as
possible. Since in any feasible solution it must be non-negative, there can be no feasible solutions.
Put another way, row $q + 1$ of the tableau expresses the negative number $u_{q+1}$ as a sum of terms
$y_{q+1,s} x_s$ with every $y_{q+1,s} \ge 0$, which is impossible when every $x_s \ge 0$.

□

4.8. Theorem. If $u_{q+1} < 0$ and $y_{q+1,s} < 0$ and there is no $r$ with $1 \le r \le q$ such that
$y_{rs} > 0$, then pivoting at $y_{q+1,s}$ turns this bad row into a good row and all the good rows
above remain as good rows.

Proof. We carry out the calculation as described in the theorem. The manipulations are
shown in Table 4.7. We see that after performing the pivot, all the good rows stay good,
since we are given that $y_{is} \le 0$ for every good row $i$, while the first bad row, about which
we pivot, becomes good, since each of the terms in the quotient $u_{q+1}/y_{q+1,s}$ is negative.

□









Before the pivot:

              a_1  a_2  ...  a_s  ...  a_n     e      b
a_{h_1}                   y_{1s}               0     u_1        (>= 0)
  ...
a_{h_i}                   y_{is}               0     u_i        (>= 0)
  ...
a_{h_q}                   y_{qs}               0     u_q        (>= 0)
a_{h_{q+1}}               y_{q+1,s}            0     u_{q+1}    (< 0)
  ...
a_{h_m}                   y_{ms}               0     u_m        (< 0)
e             v_1  v_2  ...  v_s  ...  v_n     1      z

After pivoting at y_{q+1,s}:

              a_1   a_2   ...  a_s  ...  a_n     e      b
a_{h_1}                         0                0     u_1 - y_{1s}(u_{q+1}/y_{q+1,s})    (>= 0)
  ...
a_{h_i}                         0                0     u_i - y_{is}(u_{q+1}/y_{q+1,s})    (>= 0)
  ...
a_{h_q}                         0                0     u_q - y_{qs}(u_{q+1}/y_{q+1,s})    (>= 0)
a_{h_{q+1}}                     1                0     u_{q+1}/y_{q+1,s}                  (> 0)
  ...
a_{h_m}                         0                0     *                                  (?)
e             v'_1  v'_2  ...   0   ...  v'_n    1     ?

Table 4.7: Pivoting on a bad row; we don’t care about $v'_1, v'_2, \ldots, v'_n$ at this stage.


4.5 Avoiding Non-Negativity 

We can show that, given a linear programming problem in which not all the variables are
required to be non-negative, it is possible to form a related problem with non-negativity
restrictions, whose solution yields the solution of the original problem. If $x_j$ is any such
free variable (i.e. without non-negativity restriction), the method is to write $x_j = x_j' - x_j''$,
with $x_j'$ and $x_j''$ satisfying non-negativity restrictions $x_j' \ge 0$, $x_j'' \ge 0$. This gives a related
problem with more variables but having non-negativity restrictions.

4.9. Example. Let $P$ be the linear programming problem: maximise $3x_1 + x_2$ subject to

$x_1 - 3x_2 \ge -3,$

$2x_1 + 3x_2 \ge -6,$

$2x_1 + x_2 \le 8,$

$4x_1 - x_2 \le 16.$

Solution There are no non-negativity conditions so we write $x_1 = x_1' - x_1''$ and $x_2 = x_2' - x_2''$
and obtain the related problem $P'$:

Maximise $3x_1' - 3x_1'' + x_2' - x_2''$ subject to

$x_1' - x_1'' - 3x_2' + 3x_2'' \ge -3,$

$2x_1' - 2x_1'' + 3x_2' - 3x_2'' \ge -6,$

$2x_1' - 2x_1'' + x_2' - x_2'' \le 8,$

$4x_1' - 4x_1'' - x_2' + x_2'' \le 16,$

$x_1', x_1'', x_2', x_2'' \ge 0.$


It isn’t hard to see that when this related problem $P'$ is constructed we have:

1. for each solution of $P$ with objective value $z$ there exists a solution of $P'$ with the
same objective value; and

2. for each solution of $P'$ with objective value $z$ there corresponds a unique solution of
$P$, which has the same objective value.

There is some abuse of terminology because, strictly speaking, $P$ does not
have “feasible” solutions. Nevertheless it follows from 1 and 2 that if $P$ has an optimal
solution then so does $P'$, and conversely, the optimal value being the same, and if $P$ has
no optimal solution then neither does $P'$, and conversely.
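
The substitution is purely mechanical: every column of coefficients is replaced by the pair (column, minus the column). The Python sketch below, with names invented for this illustration, builds the coefficient rows of $P'$ from those of $P$ and maps a solution of $P'$ back to one of $P$. The rows of $P$ are written in “less than or equals” form, followed by the objective row $-3, -1$.

```python
def split_free(rows):
    """Replace each coefficient a (for x_j) by the pair a, -a (for x_j', x_j'')."""
    return [[v for a in row for v in (a, -a)] for row in rows]

def recover(xs):
    """Map (x_1', x_1'', x_2', x_2'', ...) back to (x_1, x_2, ...)."""
    return [xs[k] - xs[k + 1] for k in range(0, len(xs), 2)]

# Problem P in <= form:  -x1 + 3 x2 <= 3,  -2 x1 - 3 x2 <= 6,
#                         2 x1 +   x2 <= 8,   4 x1 -   x2 <= 16,
# followed by the objective row for maximising 3 x1 + x2.
P = [[-1, 3], [-2, -3], [2, 1], [4, -1], [-3, -1]]

P_prime = split_free(P)
for row in P_prime:
    print(row)

x = recover([4, 0, 0, 0])   # x_1' = 4 and the other three variables zero
print(x)
```

Note that many solutions of $P'$ map to the same solution of $P$: adding the same amount to $x_j'$ and $x_j''$ changes nothing in $P$.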

Here is a Maple solution to show how the pivoting goes: 

> with(linalg): A:=matrix([[-1,1,3,-3],[-2,2,-3,3],
> [2,-2,1,-1],[4,-4,-1,1],[-3,3,-1,1]]):

> A1:=concat(A,diag(1,1,1,1,1),vector([3,6,8,16,0]));

      [ -1    1    3   -3   1   0   0   0   0    3 ]
      [ -2    2   -3    3   0   1   0   0   0    6 ]
A1 := [  2   -2    1   -1   0   0   1   0   0    8 ]
      [  4   -4   -1    1   0   0   0   1   0   16 ]
      [ -3    3   -1    1   0   0   0   0   1    0 ]

> A2:=mulrow(A1,3,1/2):A2:=pivot(A2,3,1);

      [ 0    0   7/2   -7/2   1   0   1/2   0   0    7 ]
      [ 0    0   -2      2    0   1    1    0   0   14 ]
A2 := [ 1   -1   1/2   -1/2   0   0   1/2   0   0    4 ]
      [ 0    0   -3      3    0   0   -2    1   0    0 ]
      [ 0    0   1/2   -1/2   0   0   3/2   0   1   12 ]

> A3:=mulrow(A2,4,1/3):A3:=pivot(A3,4,4);

      [ 0    0    0   0   1   0   -11/6    7/6   0    7 ]
      [ 0    0    0   0   0   1     7/3   -2/3   0   14 ]
A3 := [ 1   -1    0   0   0   0     1/6    1/6   0    4 ]
      [ 0    0   -1   1   0   0    -2/3    1/3   0    0 ]
      [ 0    0    0   0   0   0     7/6    1/6   1   12 ]

Thus we have $x_1' = 4$, $x_2'' = 0$, $x_3 = 7$ and $x_4 = 14$. The remaining variables, $x_1''$, $x_2'$,
$x_5$ and $x_6$, are not basic variables and so are zero. The maximum value of the objective
function is 12. In terms of the original variables, this is attained when $x_1 = 4$ and $x_2 = 0$.

Other Substitutions A similar use of substitution can be made if, for example, a problem
has, instead of non-negativity restrictions, other restrictions like $x_1 \ge 2$, $x_2 \le 6$,
$x_3 \ge -4$. One then substitutes $x_1' = x_1 - 2$, $x_2' = 6 - x_2$, $x_3' = x_3 + 4$, for then $x_1', x_2', x_3' \ge 0$.


4.6 An Alternative View 

We finish this chapter by showing an alternative way of reducing problems to the one-phase 
algorithm which makes greater use of artificial variables. This is the approach taken by 
Press et al. (1992) to the two phase algorithm. Again we illustrate with an example. 

4.10. Example. Maximise $z = x_1 + x_2 + 3x_3 - \frac{1}{2}x_4$ subject to $x_i \ge 0$ for $1 \le i \le 4$ and

$x_1 + 2x_3 \le 740,$

$2x_2 - 7x_4 \le 0,$

$x_2 - x_3 + 2x_4 \ge \frac{1}{2},$

$x_1 + x_2 + x_3 + x_4 = 9.$

To help keep count, I’m going to use $y_i$ for a slack variable and $z_i$ for an artificial
variable. We first introduce slack variables to give

$x_1 + 2x_3 + y_1 = 740,$

$2x_2 - 7x_4 + y_2 = 0,$

$x_2 - x_3 + 2x_4 - y_3 = \frac{1}{2},$

$x_1 + x_2 + x_3 + x_4 = 9.$

Of course we only need three slack variables, and the third constraint shows we are
going to need the two-phase version of the algorithm. Now comes the alternative approach
to artificial variables: let’s use lots! Specifically we define an artificial variable for each
constraint by

$z_1 = 740 - x_1 - 2x_3 - y_1,$

$z_2 = -2x_2 + 7x_4 - y_2,$

$z_3 = \frac{1}{2} - x_2 + x_3 - 2x_4 + y_3,$

$z_4 = 9 - x_1 - x_2 - x_3 - x_4.$

Thus we have introduced a “full set” of artificial variables $z_1 \ge 0$, $z_2 \ge 0$, $z_3 \ge 0$ and $z_4 \ge 0$.
We chose the sign of the right hand side to ensure that each constant is non-negative. Thus
we could write the above system in tableau form; we have then chosen things so this is a
one-phase problem; specifically there is a feasible solution with only the artificial variables
non-zero.

However this isn’t the same system as the original unless each $z_i$ is zero. Note that
$z' = z_1 + z_2 + z_3 + z_4 \ge 0 + 0 + 0 + 0 = 0$. Thus the minimum value of $z'$ is non-negative.
Note also that if $z' = 0$ then each $z_i = 0$.

Thus if we can find a basic feasible solution to this new problem, in which 2 ' = 0 , then 
all our artificial variables vanish, and we can go to work on the “proper” variables, and 
the “real” objective function exactly as we did in the third tableau of Example 4 . 6 . So 
we start by working on an auxiliary objective function 2' = 21 + 22 + 23 + 24 which we 
seek to minimise. On the way, we transform the real objective function during each pivot, 
thus ensuring our new system remains row-equivalent to the old. Considering the auxiliary 
objective function then gives an alternative to the use of “bad rows” as a way of finding a 
basic feasible solution to our original problem. 

Notice also that if our original problem has a basic feasible solution, then that solution 
gives the correct minimum value for the auxiliary objective function. So the above method 
will succeed providing the original problem has a solution. 

To summarise this method: 

• introduce slack variables so each constraint is an equality; 

• introduce artificial variables with signs chosen so the new problem has a basic feasible 
solution; 

• minimise the auxiliary objective function; and 

• use the resulting basic feasible solution of the original problem to start the one-phase 
algorithm. 

• Finally note that if the auxiliary objective function cannot be reduced to zero, the 
original problem has no solution. 
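
The first two steps of the summary can be sketched in a few lines of Python. This is only an illustration: the function names and the small two-constraint system are hypothetical and are not taken from the notes. It assumes the slack variables have already been added, so every constraint is an equality, and it uses exact fractions to avoid rounding.

```python
from fractions import Fraction

def add_artificials(rows, rhs):
    """Flip the sign of any equality whose constant is negative.

    After flipping, artificial variable i is z_i = rhs[i] - sum_j rows[i][j] * v_j,
    so setting every v_j = 0 gives a feasible start with z_i = rhs[i] >= 0.
    """
    new_rows, new_rhs = [], []
    for row, b in zip(rows, rhs):
        sign = -1 if b < 0 else 1
        new_rows.append([sign * Fraction(a) for a in row])
        new_rhs.append(sign * Fraction(b))
    return new_rows, new_rhs

def artificial_values(rows, rhs, v):
    """Values of the artificial variables at the point v."""
    return [b - sum(a * x for a, x in zip(row, v)) for row, b in zip(rows, rhs)]

def auxiliary_objective(rows, rhs, v):
    """z' = z_1 + ... + z_m, the quantity to be minimised first."""
    return sum(artificial_values(rows, rhs, v))

# Hypothetical system in the variables (x1, x2, y1, y2):
#   x1 + x2 + y1 = 4   and   x1 - x2 - y2 = -1
rows, rhs = add_artificials([[1, 1, 1, 0], [1, -1, 0, -1]], [4, -1])
print(artificial_values(rows, rhs, [0, 0, 0, 0]))    # artificial variables at the start
print(auxiliary_objective(rows, rhs, [1, 2, 1, 0]))  # zero at a point feasible for the original
```

The auxiliary objective is zero exactly at points that satisfy the original equalities, which is why driving it to zero yields a starting point for the one-phase algorithm.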

4.11. Example. Use Maple to solve the problem in Example 4.10. Try it using the minimum
number of artificial variables as described in Section 4.3 and also using a “full set”
of them as just described. Summarise the advantages and disadvantages of each method.


4.7 Questions 3 (Hints and solutions start on page 112.) 




4.1. Q. a) In carrying out the two-phase simplex algorithm, assume you arrive at the following
tableau, with one bad row and q good rows.

Basis       | a_1        ...  a_s        ...  a_n        | e | b
a_{h_1}     | y_{1,1}    ...  y_{1,s}    ...  y_{1,n}    | 0 | u_1 (>= 0)
  ...       |            ...             ...             |   |  ...
a_{h_q}     | y_{q,1}    ...  y_{q,s}    ...  y_{q,n}    | 0 | u_q (>= 0)
a_{h_{q+1}} | y_{q+1,1}  ...  y_{q+1,s}  ...  y_{q+1,n}  | 0 | u_{q+1} (< 0)
e           | v_1        ...  v_s        ...  v_n        | 1 | z

The next step in the algorithm forces $a_s$ to be introduced into the basis. Given that $y_{i,s} < 0$
for $1 \le i \le q$, and that $y_{q+1,s} < 0$, about which entry in the column should you pivot? Prove that
your choice leads to a new tableau corresponding to a basic feasible solution.

b) Consider the linear programming problem: 

minimise $6x_1 + 7x_2 + x_3 + 15x_4$ subject to $x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$, $x_4 \ge 0$ and
\[
\begin{aligned}
x_1 + x_2 + 3x_4 &= 6,\\
x_1 + x_2 - x_3 + 2x_4 &\ge 5.
\end{aligned}
\]

Apply the simplex method to show that the optimal value is 33, and give values of the variables 
at which this is attained. Describe briefly the reasoning behind each step. [When you have a free 
choice of pivots in a row, you are recommended to choose the one furthest to the left.] 

4.2. Q. Explain how the simplex method, when applied to an unbounded linear programming problem,
will eventually reveal the fact that it is unbounded.

Use the Simplex Algorithm to minimise $-6x_1 + 2x_2 - 3x_3$, subject to $x_1, x_2, x_3 \ge 0$ and
\[
\begin{aligned}
x_1 + x_2 - 2x_3 &\le 2,\\
-4x_1 + x_2 - x_3 &\ge 6,\\
x_1 + 3x_2 - 8x_3 &\le 8.
\end{aligned}
\]

Describe briefly the reasoning behind each step. 


4.3. Q. a) A small portion of a tableau giving a basic feasible solution of a linear programming
problem is shown in the following table and the simplex algorithm indicates the need to pivot about
the entry $y_{rs}$.

Basis   | ...  a_s     ... | b
a_{h_i} | ...  y_{is}  ... | u_i
a_{h_r} | ...  y_{rs}  ... | u_r

By performing this pivot on the entries shown, giving the resulting partial tableau, explain why
it was necessary to consider certain ratios before choosing to pivot in row r.















b) Use the Simplex Algorithm to minimise $-2x_1 + x_2 - 4x_3$, subject to $x_1, x_2, x_3 \ge 0$ and
\[
\begin{aligned}
x_1 + x_3 &\le 4,\\
x_1 - 2x_2 - 3x_3 &= 2,\\
x_1 - 3x_2 - x_3 &\le 1.
\end{aligned}
\]

Describe briefly the reasoning behind each step.

4.4. Q. Use the Simplex Algorithm to minimise $6x_1 - x_2 + 12x_3$, subject to $x_1, x_2, x_3 \ge 0$ and
\[
\begin{aligned}
x_1 + 3x_3 &\le 2,\\
2x_1 - x_2 - x_3 &= 3,\\
4x_1 - x_2 - 2x_3 &\ge 7.
\end{aligned}
\]

Describe briefly the reasoning behind each step.




Chapter 5 


Duality 


5.1 Formulation of the Dual 

5.1. Definition. Given a Linear Programming Problem (called the primal problem) in
the form: maximise $c_1x_1 + c_2x_2 + \dots + c_nx_n$ subject to
\[
\sum_{j=1}^{n} a_{ij}x_j \le b_i \quad\text{for } i = 1, 2, \dots, m, \qquad
x_j \ge 0 \quad\text{for } j = 1, 2, \dots, n,
\]
then its dual is the problem: minimise $b_1w_1 + b_2w_2 + \dots + b_mw_m$ subject to
\[
\sum_{j=1}^{m} a_{ji}w_j \ge c_i \quad\text{for } i = 1, 2, \dots, n, \qquad
w_j \ge 0 \quad\text{for } j = 1, 2, \dots, m.
\]

In matrix notation, the primal can be written as: maximise $c^Tx$ subject to $Ax \le b$,
$x \ge 0$, and the dual as: minimise $b^Tw$ subject to $A^Tw \ge c$, $w \ge 0$.

It is clear that any Linear Programming Problem can be put in the form of a primal
problem. Thus a minimising problem can be reformulated as a maximising problem by
changing the sign of the objective function and any $\ge$ inequalities can be reformulated as
$\le$ inequalities.

There is a particular way of dealing with constraints that are equations. Certainly any
equation is equivalent to two inequalities which can both be written as $\le$ inequalities. In
the resulting dual one finds that the two corresponding variables can be replaced by one
variable, which however is unrestricted. Thus, if the $i$th constraint in the primal (before
reformulation) is an equation then the $i$th variable in the dual (after reformulation in the
way illustrated below) is unrestricted.

5.2. Example. Find the dual of the problem: maximise $10x_1 + x_2 + 4x_3 + 7x_4$ subject to
\[
\begin{aligned}
3x_1 - 2x_2 + 7x_3 - 6x_4 &\le 5,\\
8x_1 + 4x_2 - 11x_3 + x_4 &\ge 12,\\
9x_1 + 15x_2 + 14x_3 + 10x_4 &= 13,\\
x_1, x_2, x_3, x_4 &\ge 0.
\end{aligned}
\]




Solution Putting this in the required form gives the primal:
maximise $10x_1 + x_2 + 4x_3 + 7x_4$ subject to
\[
\begin{aligned}
3x_1 - 2x_2 + 7x_3 - 6x_4 &\le 5,\\
-8x_1 - 4x_2 + 11x_3 - x_4 &\le -12,\\
9x_1 + 15x_2 + 14x_3 + 10x_4 &\le 13,\\
-9x_1 - 15x_2 - 14x_3 - 10x_4 &\le -13,\\
x_1, x_2, x_3, x_4 &\ge 0.
\end{aligned}
\]

The dual is therefore: minimise $5w_1 - 12w_2 + 13w_3' - 13w_3''$ subject to
\[
\begin{aligned}
3w_1 - 8w_2 + 9w_3' - 9w_3'' &\ge 10,\\
-2w_1 - 4w_2 + 15w_3' - 15w_3'' &\ge 1,\\
7w_1 + 11w_2 + 14w_3' - 14w_3'' &\ge 4,\\
-6w_1 - w_2 + 10w_3' - 10w_3'' &\ge 7,\\
w_1, w_2, w_3', w_3'' &\ge 0.
\end{aligned}
\]

Now let $w_3 = w_3' - w_3''$ and the problem becomes: minimise $5w_1 - 12w_2 + 13w_3$ subject to
\[
\begin{aligned}
3w_1 - 8w_2 + 9w_3 &\ge 10,\\
-2w_1 - 4w_2 + 15w_3 &\ge 1,\\
7w_1 + 11w_2 + 14w_3 &\ge 4,\\
-6w_1 - w_2 + 10w_3 &\ge 7,\\
w_1, w_2 &\ge 0,
\end{aligned}
\]
noting that $w_3$, corresponding to the third of the original constraints which was an equation,
is unrestricted.
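
Once a problem is in the primal form of Definition 5.1, writing down the dual is purely mechanical: swap the roles of $b$ and $c$ and transpose $A$. The following Python sketch illustrates this on the four-constraint form used in the solution above; the function name is an illustrative choice, not notation from the notes.

```python
def dual_of_standard_primal(c, A, b):
    """Primal: maximise c.x subject to A x <= b, x >= 0.

    Returns (objective, matrix, rhs) describing the dual:
    minimise objective.w subject to matrix w >= rhs, w >= 0.
    """
    A_transposed = [list(column) for column in zip(*A)]
    return list(b), A_transposed, list(c)

# Example 5.2 after rewriting every constraint as a "<=" inequality.
c = [10, 1, 4, 7]
A = [[3, -2, 7, -6],
     [-8, -4, 11, -1],
     [9, 15, 14, 10],
     [-9, -15, -14, -10]]
b = [5, -12, 13, -13]

objective, matrix, rhs = dual_of_standard_primal(c, A, b)
print(objective)  # coefficients of w1, w2, w3', w3''
for row, bound in zip(matrix, rhs):
    print(row, ">=", bound)
```

The last two columns of the dual matrix are negatives of each other, which is what allows the pair $w_3', w_3''$ to be merged into a single unrestricted variable.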


5.3. Theorem. The dual of the dual is the primal. 


Proof. This is simply a matter of chasing the definitions, but is worthwhile because it gives 
familiarity with the concepts. We write the primal as: 

 maximise $c^Tx$ subject to $Ax \le b$ and $x \ge 0$,

and so the dual is:

 minimise $b^Tw$ subject to $A^Tw \ge c$ and $w \ge 0$.

We now have to reformulate this as a primal problem, and so write it as a maximising
problem:

 maximise $(-b)^Tw$ subject to $(-A)^Tw \le -c$ and $w \ge 0$.

In this form we can again write down a dual; we call it the bidual, because it is the dual of
the (reformulated) dual. The bidual is:

 minimise $(-c)^Tx$ subject to $((-A)^T)^Tx \ge ((-b)^T)^T$ and $x \ge 0$.




Rewriting this to remove double transposes, and negative signs gives

 maximise $c^Tx$ subject to $Ax \le b$ and $x \ge 0$,

which is the original problem. □

A consequence of the theorem is that, if desired, the primal can be formulated in the
form: minimise $c^Tx$ subject to $Ax \ge b$, $x \ge 0$, in which case its dual is the problem:
maximise $b^Tw$ subject to $A^Tw \le c$, $w \ge 0$. In other words, we can easily confuse the
roles of maximising and minimising, without running into confusion.¹

It may also be useful to note that given a problem in which all the variables are unrestricted,
one can use the technique above in reverse to obtain as its dual a problem in
which all the constraints are equations. We have already seen a technique for solving such
problems, and we are next going to show that solutions of the primal and dual problems
are related.

5.2 The Fundamental Theorem 

5.4. Lemma. Let $x$ and $w$ be feasible solutions of the above primal problem and its dual.
Then $c^Tx \le b^Tw$.

Proof. This result is a simple symbol chase from the hypothesis: since $x$ and $w$ are feasible
solutions of the above primal problem and its dual, with the notation we have been using
as standard, we know that
\[
\sum_{j=1}^{n} a_{ij}x_j \le b_i \quad\text{for } i = 1, 2, \dots, m,
\qquad\text{and}\qquad
\sum_{i=1}^{m} a_{ij}w_i \ge c_j \quad\text{for } j = 1, 2, \dots, n.
\]
Computing, we thus have
\[
\sum_{i=1}^{m} b_iw_i \ge \sum_{i=1}^{m}\Bigl(\sum_{j=1}^{n} a_{ij}x_j\Bigr)w_i
= \sum_{j=1}^{n}\Bigl(\sum_{i=1}^{m} a_{ij}w_i\Bigr)x_j ,
\]
interchanging the order of summation,
\[
\ge \sum_{j=1}^{n} c_jx_j ,
\]
which is the result claimed. □

You may prefer the following vector version of this; it is the same argument expressed
in a different way. Note first that if $u \le v$ and $x \ge 0$ then $x^Tu \le x^Tv$ since we first
multiply an inequality by $x_j \ge 0$ and then add. Then, since $c \le A^Tw$, and $x \ge 0$, we have
\[
x^Tc \le x^TA^Tw = (Ax)^Tw \le b^Tw ,
\]
where in the last inequality we have used the note above and the fact that $Ax \le b$. Since
$x^Tc = c^Tx$, the result follows.

¹ I hope you aren’t confused!





5.5. Lemma. Let $x$ and $w$ be feasible solutions of the above primal problem and its dual.
If $c^Tx = b^Tw$ then $x$ and $w$ are optimal solutions of the primal and dual.

Proof. This is a simple consequence of Lemma 5.4. Recall that for the primal problem, we
are trying to maximise
\[
\sum_{j=1}^{n} c_jx_j \le \sum_{i=1}^{m} b_iw_i ,
\]
where we have used Lemma 5.4 to get the inequality. If the above inequality is actually an
equality, then no feasible change to the $x_j$ can increase the value of the objective function
$c^Tx$. And in the same way, no feasible change to the $w_i$ can decrease the value of the
objective function $b^Tw$ of the dual problem. So both must already correspond to optimal
solutions; indeed we see that both the primal and the dual problem have optimal solutions
with the same optimal value. □
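
Lemmas 5.4 and 5.5 are easy to check numerically on a small case. The sketch below uses a hypothetical two-variable problem (not one from the notes) and helper names chosen for illustration; it tests feasibility directly from the definitions and compares the two objective values.

```python
from fractions import Fraction

def dot(u, v):
    """Exact inner product of two equal-length sequences."""
    return sum(Fraction(a) * Fraction(b) for a, b in zip(u, v))

def primal_feasible(A, b, x):
    """x >= 0 and A x <= b."""
    return all(xj >= 0 for xj in x) and all(dot(row, x) <= bi for row, bi in zip(A, b))

def dual_feasible(A, c, w):
    """w >= 0 and A^T w >= c."""
    columns = list(zip(*A))
    return all(wi >= 0 for wi in w) and all(dot(col, w) >= cj for col, cj in zip(columns, c))

# Hypothetical primal: maximise 3*x1 + 2*x2 subject to x1 + x2 <= 4, x1 <= 2, x >= 0.
A = [[1, 1], [1, 0]]
b = [4, 2]
c = [3, 2]

x, w = [1, 1], [3, 0]            # feasible, but not optimal
assert primal_feasible(A, b, x) and dual_feasible(A, c, w)
print(dot(c, x), "<=", dot(b, w))        # Lemma 5.4

x_opt, w_opt = [2, 2], [2, 1]    # feasible with equal objective values
assert primal_feasible(A, b, x_opt) and dual_feasible(A, c, w_opt)
print(dot(c, x_opt), "==", dot(b, w_opt))  # Lemma 5.5: both are optimal
```

Any feasible dual point gives an upper bound on the primal objective, so exhibiting a matching pair is a certificate of optimality that needs no tableau.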

5.6. Theorem. If a Linear Programming Problem has an optimal solution then its dual 
has an optimal solution, with the same optimal value. 

Proof. We start with the primal:

 maximise $c^Tx$ subject to $Ax \le b$ and $x \ge 0$,

and we assume the primal problem has an optimal solution $x$, with optimal value $z = c^Tx$.
Then the initial tableau is as shown in Table 5.1. Note that the last row uses the fact that
we know the coefficients of the objective function; we are also relying on the use of the
techniques we have already discussed (including the use of artificial variables) to introduce
the correct number (a total of $m$) of slack variables.

Basis    | a_1   a_2   ...  a_n   | Slack variables | e | b
Last     |                        |                 | 0 | b_1
m        |           A            |       I_m       | 0 | b_2
columns  |                        |                 | : |  :
         |                        |                 | 0 | b_m
e        | -c_1  -c_2  ...  -c_n  |  0    ...    0  | 1 | 0

Table 5.1: Initial tableau for the primal Ax = b.

Now consider the situation when the simplex algorithm has terminated to give the final
tableau shown in Table 5.2. Since this is a final tableau, none of $v_1, \dots, v_n, w_1, \dots, w_m$ has
the proper sign for improvement, and so $v_j \ge 0$ and $w_i \ge 0$ for all $i, j$.

Consider next the way in which the final tableau is obtained from the initial tableau;
it is simply by a finite number of consecutive pivoting operations, on rows excluding the
last row. This means that the last row in the final tableau can be written as a linear
combination of the rows in the initial tableau of the form

 Last row of final tableau = Last row of initial tableau
   + $\lambda_1$ × first row of initial tableau + ...
   + $\lambda_m$ × $m$th row of initial tableau.











Basis    | a_1  a_2  ...  a_n  | Slack variables  | e | b
Basic    |                     |                  | 0 | u_1
m        |                     |                  | 0 | u_2
columns  |                     |                  | : |  :
         |                     |                  | 0 | u_m
e        | v_1  v_2  ...  v_n  | w_1   ...   w_m  | 1 | z

Table 5.2: Final tableau for the primal Ax = b.

Now note that the initial tableau has $I_m$ in the last $m$ columns of the body of the tableau,
and so we can read off the relevant values of $\lambda$; we have $\lambda_1 = w_1$, $\lambda_2 = w_2$, ..., $\lambda_m = w_m$.
Thus we can compute each $v_j$; we have
\[
v_j = -c_j + w_1a_{1j} + \dots + w_ma_{mj} \quad (j = 1, \dots, n)
\qquad\text{and}\qquad
z = \sum_{i=1}^{m} w_ib_i .
\]

In vector form, we thus have that $z = b^Tw$, and $A^Tw = v + c \ge c$, since $v \ge 0$;
and also $w \ge 0$. This shows that $w$ is a feasible solution of the dual problem, and that
$c^Tx = z = b^Tw$. Thus by Lemma 5.5, both $x$ and $w$ are optimal. □

5.7. Corollary. If a Linear Programming Problem and its dual both have feasible solutions 
then they both have optimal solutions. 

Proof. By hypothesis, we have feasible solutions of the primal and dual problems respectively,
say $x$ and $w$. Then by Lemma 5.4, $c^Tx \le b^Tw$. Thus $c^Tx$ is bounded above, and
so when we apply the simplex algorithm, it does not terminate in box B of the one-phase
algorithm, and as such is guaranteed to terminate with an optimal solution. Then by
Theorem 5.6, both the primal and dual problems have an optimal solution. □

To complete the possibilities, we note that the following cases can arise. 

1. The primal and dual may both have no feasible solutions. Consider the primal Linear
Programming Problem: maximise $x_1 + x_2 + x_3$ subject to
\[
\begin{aligned}
x_1 + x_2 - x_3 &\le 1,\\
x_1 - x_2 + x_3 &\le -2,\\
x_1, x_2, x_3 &\ge 0,
\end{aligned}
\]
and its dual.

2. It can be that the primal has no feasible solutions but the dual does have feasible
solutions. Consider the primal Linear Programming Problem: maximise $x_1$ subject
to $x_1 \le 1$, $-x_1 \le -2$, $x_1 \ge 0$ and its dual.

The dual problem may be worth considering in cases where the primal problem is 
difficult to solve. For example, the primal may require the two-phase algorithm and the 











dual may not. But the notion of dual is important because where a Linear Programming 
Problem arises from a practical problem the dual may have some interpretation that it is 
useful to appreciate. 

5.8. Example. We work the first of these results as an example. 

Solution In the primal problem an attempt to turn the bad row into a good row fails; the 
best we can do is raise the value of the constant to —1, so there is no feasible solution of 
the primal problem. The tableau is given in Table 5.3.

Basis | a_1  a_2  a_3  a_4  a_5 | e |  b
a_4   |  1    1   -1    1    0  | 0 |  1
a_5   |  1   -1    1    0    1  | 0 | -2
e     | -1   -1   -1    0    0  | 1 |  0

a_4   |  1    1   -1    1    0  | 0 |  1
a_5   |  2    0    0    0    1  | 0 | -1
e     |  0    0   -2    1    0  | 1 |  0

Table 5.3: Duality example: the primal problem has no optimal solution. 
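
As a cross-check that does not rely on the tableau, one can simply add the two constraints of the primal problem; this short derivation is offered as a supplementary remark rather than as part of the original solution.

\[
(x_1 + x_2 - x_3) + (x_1 - x_2 + x_3) \le 1 + (-2)
\quad\Longrightarrow\quad
2x_1 \le -1 ,
\]

which is impossible when $x_1 \ge 0$. So no point satisfies both constraints together with the non-negativity restrictions, in agreement with the failure to turn the bad row into a good row.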


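The primal problem has
\[
c = (1, 1, 1), \qquad
A = \begin{pmatrix} 1 & 1 & -1 \\ 1 & -1 & 1 \end{pmatrix}, \qquad
b = \begin{pmatrix} 1 \\ -2 \end{pmatrix},
\]
and so the dual becomes: minimise $w_1 - 2w_2$ subject to
\[
\begin{aligned}
w_1 + w_2 &\ge 1,\\
w_1 - w_2 &\ge 1,\\
-w_1 + w_2 &\ge 1,\\
w_1, w_2 &\ge 0.
\end{aligned}
\]

In this case it is clear that there is no feasible solution, since the last constraint can be
written as $w_1 - w_2 \le -1$, which is clearly incompatible with the second constraint.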


5.3 Shadow Prices 

Often the dual of a Linear Programming Problem has a natural interpretation. We illustrate 
this with an example. 

A poor student has decided that she needs to obtain each day an additional $b_i$ grams
of vitamin $V_i$ for $1 \le i \le m$ by consuming additional quantities of a number of foods $F_j$
for $1 \le j \le n$. Food $F_j$ contains a total of $a_{ij}$ grams of vitamin $V_i$ and costs £$c_j$ per gram.
These values are given for each $i$ and $j$. Our first problem is to determine the amount, in
grams, of each food to be consumed in order to provide the required additional vitamins at
the least cost.

To this end, let $x_j$ be the amount, in grams, of food $F_j$ to be consumed; clearly we
have $x \ge 0$. The total cost, which we want to minimise, is then $c^Tx$. With this choice,










we consume $\sum_{j=1}^{n} a_{ij}x_j$ grams of vitamin $V_i$, so we meet the required intake provided that
$Ax \ge b$.

At this point a wicked salesman shows up with a stock of vitamin pills, and rather 
flexible prices. He wishes to persuade our poor student not to get her extra vitamins 
from real food, but instead to buy them directly in the form of vitamin pills. For each 
1 < i < m, pill Pi contains one gram of Vi. What price should the salesman charge for Pi 
in order that our student abandon her healthy lifestyle in favour of a cheaper solution, and 
yet the salesman still maximises his profit? 

If the salesman charges £$w_i$ for each $P_i$, his total income will be $w^Tb$, which we seek
to maximise.² Clearly $w \ge 0$. What other constraints are there on $w$? One way to express
it is that the cost of pills to supply the vitamin content of 1 gram of $F_j$ should be no more than
the cost of the food. Since $F_j$ supplies $a_{ij}$ grams of vitamin $V_i$, we must have
\[
\sum_{i=1}^{m} a_{ij}w_i \le c_j \quad\text{for } 1 \le j \le n,
\]
or $A^Tw \le c$.

In other words, the salesman’s problem is dual to that of the student’s.

Let $z^*$ be the optimal value; the minimum of $c^Tx$. We can consider $z^*$ as a function of
the given $a_{ij}$, $b_i$ and $c_j$. Then if
\[
\frac{\partial z^*}{\partial b_i} = \lambda_i ,
\]
we observe that $\lambda_i$ is the sensitivity of $z^*$ to changes in $b_i$, or the increase in the objective
function if $b_i$ is increased by 1. We call $\lambda_i$ the shadow price of the vitamin. The above
argument suggests that these give the solution to the dual problem.


5.4 The Dual Simplex Method 

Although we don’t go into details here, there is another reason why duality proves valuable,
associated with the need, in real life, to modify the constraints of a problem. Suppose we
have already solved (say) a maximising problem. We can describe the solution as feasible
(every entry in the last column is non-negative) and optimal (every entry in the bottom row
is non-negative). Adding an additional constraint is likely to render the “current” solution
infeasible, although it will remain optimal in the above sense. Applying the conventional
algorithm will first work on the (potentially) bad row, and then re-do the optimisation. In
contrast the dual problem is likely to become non-optimal (a bad row “duals” to a proper
sign for improvement) but feasible (an optimal objective row duals to a feasible but non-optimal
solution). As such, restoring optimality is likely in practice to be quicker working
on the dual. For more details, see for example Kolman & Beck (1995, Section 3.4).

² We ignore the unlikely situation in which he sells a pill for below his cost; the pills are very cheap to
make!





5.5 Questions 4 (Hints and solutions start on page 118.) 

5.1. Q. Write down the dual of the linear programming problem given in Question 4.1, and show 
that the dual can be transformed into a linear programming problem in which one of the variables is 
unrestricted. Using the information contained in Question 4.1, give the optimal value for the dual, 
and explain your reasoning. [You are not expected to solve the problem.] 


5.2. Q. a) A linear programming problem, the “primal” problem, is given in the form: 

maximise $c^Tx$ subject to the constraints $Ax \le b$ and $x \ge 0$.

Write down the dual problem. Given that $x$ and $w$ are feasible solutions of the above primal and
its dual, show that $c^Tx \le b^Tw$. What can you deduce if $c^Tx = b^Tw$? Prove your assertion.

b) The following tableau is derived from a linear programming problem of the above form by
inserting $x_4$, $x_5$ and $x_6$ as slack variables.

Basis | a_1  a_2  a_3  a_4  a_5  a_6 | e | b
a_4   |   2    3   -1    1    0    0 | 0 | 4
a_5   |  -1   -2    0    0    1    0 | 0 | 3
a_6   |  -1    1*  -2    0    0    1 | 0 | 1
e     | -10  -20   10    0    0    0 | 1 | 0

Write down the original problem in “primal” form and hence write down the dual problem.
After one pivot in the above tableau, the following tableau is obtained.

Basis | a_1  a_2  a_3  a_4  a_5  a_6 | e |  b
a_4   |   5    0    5    1    0   -3 | 0 |  1
a_5   |  -3    0   -4    0    1    2 | 0 |  5
a_2   |  -1    1   -2    0    0    1 | 0 |  1
e     | -30    0  -30    0    0   20 | 1 | 20

Solve the original linear programming problem and also write down a solution of the dual
problem.

5.3. Q. Let P be the linear programming problem: minimise $8x_1 + 5x_2 + 4x_3$ subject to the constraints
$x_1 \ge 0$, $x_2 \ge 0$, $x_3 \ge 0$, and
\[
\begin{aligned}
x_1 + 5x_2 - x_3 &\ge 1,\\
-x_1 + 2x_2 - x_3 &\le 4.
\end{aligned}
\]


Write down the dual problem P *. What can you conclude if both P and P* have feasible solutions? 










Chapter 6 

The Theory of Games 


6.1 Matrix Games 


A game is, in general terms, a conflict situation between opponents that can be rationally 
analysed. The class of games we shall study are those in which the possible outcomes can 
be specified by means of a matrix: 


6.1. Definition. Let there be two players, R (or Rowman) and C (or Columnman) and
let $A = [a_{ij}]$ be an $m \times n$ matrix. In the matrix game associated with $A$, R chooses a
row $a_i$ and simultaneously C chooses a column $a^j$. As a result C pays to R an amount of
$a_{ij}$ units.

It is a convention that the matrix element $a_{ij}$ gives the amount that C pays to R, called
the payoff. Such a game is a “zero-sum” game in that the total amount received by the
two players is zero: R receives $a_{ij}$ and C receives $-a_{ij}$.

6.2. Example. Assume that each of Rowman and Columnman put down a £1 coin. The
game is such that if both coins come down heads, or both come down tails, then Rowman
wins both coins; if they disagree, then Columnman wins both coins. Describe this as a
matrix game.

Solution Let the table represent the possible outcomes as follows:

                 C’s coin
                 H     T
R’s coin    H    1    -1
            T   -1     1

We have written the value to Rowman in each entry; this is also the loss to Columnman
of the given outcome, and the interpretation of the table is exactly as required.

Sometimes a game is easy to analyse. Consider the matrix game
\[
\begin{pmatrix} -1 & 1 & 4 \\ -3 & 0 & 1 \end{pmatrix}.
\]
Then Rowman is always better off choosing row 1, while Columnman always does better
by choosing column 1, and so we expect the game to result in a payoff to Columnman of 1.












But the choices are not always this easy; our aim is to come up with convincing strategies 
in more general situations. 

6.2 Pure strategies 

When such a game is played just once, the possible strategies are easy to analyse. 

6.3. Definition. Let $A = [a_{ij}]$ and let
\[
\alpha = \max_i \min_j a_{ij}, \qquad \beta = \min_j \max_i a_{ij}.
\]
Then $\alpha$ and $\beta$ are the lower and upper values of the matrix game $A$.

It is worth examining briefly what $\alpha$ represents; this is the game as seen by Rowman,
who first looks at each row in turn, and finds the smallest entry (which is the worst outcome
for him). This then is what he can expect if he takes a pessimistic view of the outcome (or
in some circumstances, believes that Columnman is playing skillfully). He then decides to
minimise his loss by finding which row has the largest such value; the least bad outcome as
far as Rowman is concerned. We see then that Rowman can gain at least $\alpha$, and possibly
more. In the same way Columnman must pay at most $\beta$, but possibly less.

6.4. Theorem. The lower and upper values $\alpha$ and $\beta$ of a matrix game satisfy $\alpha \le \beta$.

Proof. This is simply a case of using the symbols, and keeping an eye on what is required.
The first statement below is trivial, as is the second:
\[
\begin{aligned}
a_{ij} &\le a_{ij} &&\text{for all } i \text{ and } j,\\
\min_j a_{ij} &\le a_{ij} &&\text{for all } i \text{ and } j,\\
\min_j a_{ij} &\le \max_i a_{ij} &&\text{for all } i \text{ and } j.
\end{aligned}
\]
At this stage, note that the left hand side does not depend on $j$, while the right hand side
does not depend on $i$. Thus
\[
\max_i \min_j a_{ij} \le \min_j \max_i a_{ij}, \qquad\text{that is,}\qquad \alpha \le \beta.
\]
This is of course the required result. □

6.5. Definition. If R chooses a row $a_i$ such that $\min_j a_{ij} = \alpha$ then R uses a pure maxmin
strategy. If C chooses a column $a^j$ such that $\max_i a_{ij} = \beta$ then C uses a pure minmax
strategy.

These strategies can be called conservative strategies. Player R is making certain
that, of all the possible smallest winnings he can receive, he receives the largest. Whereas
C is making certain that, of all the possible largest payments he may have to make, he pays
out the least.

6.6. Example. Let
\[
A = \begin{pmatrix} 0 & 1 & -3 & 4 \\ 5 & 0 & 1 & -1 \\ 2 & -2 & 0 & 3 \end{pmatrix}.
\]
Compute $\alpha$ and $\beta$.




Solution For each row, we write down the minimum entry and for each column we write
down the maximum entry.

       |  0    1   -3    4 | Min  -3
       |  5    0    1   -1 |      -1
       |  2   -2    0    3 |      -2
Max    |  5    1    1    4 |

Thus $\alpha = \max_i \min_j a_{ij} = -1$ and $\beta = \min_j \max_i a_{ij} = 1$. So $a_2$ is the conservative strategy
for R and either $a^2$ or $a^3$ is a conservative strategy for C.
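
The row-minimum and column-maximum bookkeeping in this solution translates directly into code. The following Python sketch (function names are illustrative choices) computes the lower and upper values of Definition 6.3 and lists the conservative rows and columns, using 0-based indices.

```python
def lower_value(A):
    """alpha = max over rows of the row minimum."""
    return max(min(row) for row in A)

def upper_value(A):
    """beta = min over columns of the column maximum."""
    return min(max(col) for col in zip(*A))

def conservative_rows(A):
    """Rows (0-based) whose minimum equals alpha."""
    alpha = lower_value(A)
    return [i for i, row in enumerate(A) if min(row) == alpha]

def conservative_columns(A):
    """Columns (0-based) whose maximum equals beta."""
    beta = upper_value(A)
    return [j for j, col in enumerate(zip(*A)) if max(col) == beta]

A = [[0, 1, -3, 4],
     [5, 0, 1, -1],
     [2, -2, 0, 3]]
print(lower_value(A), upper_value(A))               # alpha and beta
print(conservative_rows(A), conservative_columns(A))
```

With 0-based indexing, row 1 here is the second row of the matrix, and columns 1 and 2 are its second and third columns.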

6.7. Definition. A matrix game is strictly determined if $\alpha = \beta$. The common value is
the value of the game.

There is some interest in the situation if a game is not strictly determined; the pessimistic
outcomes for Rowman and Columnman can’t both happen.

6.8. Definition. An entry $a_{ij}$ in the matrix $A$ is a saddle point if it is smallest in its
row and largest in its column.

6.9. Theorem. A matrix game $A$ is strictly determined if and only if $A$ has a saddle
point.

Proof. Assume first that $\alpha = \beta$. We show there is an element of $A$ which is both the
smallest in its row, and the largest in its column.

Even without the assumption that $\alpha = \beta$ there are indices $i_0$ and $j_0$ such that
\[
\alpha = \max_i \min_j a_{ij} = \min_j a_{i_0 j} \le a_{i_0 j_0} \le \max_i a_{i j_0} = \min_j \max_i a_{ij} = \beta .
\]
Since we assume $\alpha = \beta$ it follows that each inequality above is an equality. Consider now
the element $a_{i_0 j_0}$. Then
\[
\alpha = \min_j a_{i_0 j} = a_{i_0 j_0} = \max_i a_{i j_0} = \beta .
\]
The first equality only holds if $a_{i_0 j_0}$ is the smallest element in row $i_0$, while the second only
holds if it is the largest in column $j_0$.

Conversely, assume that $a_{i_0 j_0}$ is a saddle point, so that it is the smallest element in row $i_0$
and the largest in column $j_0$. Then
\[
a_{i_0 j_0} = \min_j a_{i_0 j} \le \max_i \min_j a_{ij} = \alpha .
\]
Similarly $a_{i_0 j_0} \ge \beta$, so $\alpha \ge \beta$. In general we have $\alpha \le \beta$ so in this case, $\alpha = \beta$. □
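
Definition 6.8 gives a direct test for strict determinacy: search for an entry that is smallest in its row and largest in its column. A minimal Python sketch follows, with a function name chosen for illustration and 0-based indices.

```python
def saddle_points(A):
    """Return all (row, column) positions, 0-based, of saddle points of A."""
    columns = list(zip(*A))
    return [(i, j)
            for i, row in enumerate(A)
            for j, entry in enumerate(row)
            if entry == min(row) and entry == max(columns[j])]

# The easy game discussed in Section 6.1: the top-left entry is a saddle point.
print(saddle_points([[-1, 1, 4], [-3, 0, 1]]))

# The coin-matching game of Example 6.2 has none, so it is not strictly determined.
print(saddle_points([[1, -1], [-1, 1]]))
```

By Theorem 6.9 an empty result is equivalent to $\alpha < \beta$, which is the situation that mixed strategies are designed to handle.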


6.3 Mixed Strategies 

Games which are not strictly determined can be further analysed by considering what 
happens when R and C play a game many times. Instead of choosing the same row (or 
column) on each occasion, R and C can adopt “mixed” strategies and thereby improve 
their expected payoff. 




6.10. Definition. A strategy for the player R is a vector $u$ in $\mathbb{R}^m$ with $\sum_{i=1}^{m} u_i = 1$ and
$u \ge 0$. A strategy for the player C is a vector $v$ in $\mathbb{R}^n$ with $\sum_{j=1}^{n} v_j = 1$ and $v \ge 0$.

What we have in mind is that if R has strategy $u$ then he chooses row $a_i$ with probability
$u_i$. That is, over a series of games, he chooses $a_i$ with the frequency that this specifies.
Similarly, if C has strategy $v$ then he chooses column $a^j$ with the frequency given by the
probability $v_j$.

6.11. Definition. For strategies $u$ for R and $v$ for C, the expectation $E(u, v)$ is given
by
\[
E(u, v) = \sum_{i=1}^{m} \sum_{j=1}^{n} u_i a_{ij} v_j .
\]

The expectation gives the ‘expected value’ of the payoff that R receives from C. Using
matrix notation, one has $E(u, v) = u^TAv$. Note that Rowman is sure to win $\min_v E(u, v)$
if he uses mixed strategy $u$. If he now maximises his certain winnings by choosing $u$ sensibly,
he can be sure of winning
\[
\alpha^* = \max_u \min_v E(u, v).
\]
In the same way, Columnman, using strategy $v$, will lose at most $\max_u E(u, v)$, and he can
reduce this loss as much as possible by choosing his strategy $v$ sensibly; he then loses
\[
\beta^* = \min_v \max_u E(u, v).
\]

6.12. Definition. Let
\[
\alpha^* = \max_u \min_v E(u, v) \qquad\text{and}\qquad \beta^* = \min_v \max_u E(u, v).
\]
Then $\alpha^*$ and $\beta^*$ are the optimum lower and optimum upper values of the game $A$.
Any $u$ such that
\[
\min_v E(u, v) = \alpha^*
\]
is an optimum strategy for R and any $v$ such that
\[
\max_u E(u, v) = \beta^*
\]
is an optimum strategy for C.
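
The expectation of Definition 6.11 is a double sum and is easy to evaluate exactly. The Python sketch below (function names are illustrative) also computes what a mixed strategy $u$ guarantees Rowman; it relies on the standard fact that, because $E(u, v)$ is linear in $v$, the minimum over all mixed $v$ is attained at one of the pure column choices.

```python
from fractions import Fraction

def expectation(u, A, v):
    """E(u, v) = sum_i sum_j u_i * a_ij * v_j, computed exactly."""
    return sum(Fraction(ui) * Fraction(a) * Fraction(vj)
               for ui, row in zip(u, A)
               for a, vj in zip(row, v))

def guaranteed_win(u, A):
    """min over columns j of sum_i u_i * a_ij: Rowman's worst case when using u."""
    return min(sum(Fraction(ui) * Fraction(col_entry) for ui, col_entry in zip(u, col))
               for col in zip(*A))

pennies = [[1, -1], [-1, 1]]          # the coin game of Example 6.2
half = Fraction(1, 2)

print(expectation([half, half], pennies, [half, half]))  # both players mix evenly
print(guaranteed_win([1, 0], pennies))                   # always playing row 1
print(guaranteed_win([half, half], pennies))             # mixing evenly
```

For the coin game the pure strategy guarantees only a loss of 1, while the even mixture guarantees an expected payoff of 0, which shows how mixing can improve on the lower value $\alpha$.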


6.4 The Fundamental Theorem 

6.13. Theorem. For any matrix game, $\alpha^* = \beta^*$ and the common value is the value of
the game. Thus there is an optimum strategy $u^*$ for R and an optimum strategy $v^*$ for C
such that $E(u^*, v^*) = \alpha^* = \beta^*$.




Proof. Note first that the strategy of the game is unaffected by adding the same constant $k$
to each element of the matrix $A$; in such a case, the payoff to Rowman is simply increased
by $k$. So there is no loss of generality in choosing such a $k$ in such a way that $A > 0$; we
assume that this has been done in what follows. Assume now that Rowman uses strategy
$u$, and that Columnman uses the pure strategy $j$, by which we mean that Columnman
always chooses column $j$. This combination means that Rowman’s winnings are $\sum_{i=1}^{m} a_{ij}u_i$;
and the winnings always exceed a threshold $M$, whichever choice Columnman makes, if
\[
\sum_{i=1}^{m} a_{ij}u_i \ge M \quad\text{for } j = 1, \dots, n.
\]

Thus we have $u^TA \ge M\mathbf{1}^T$, with $u \ge 0$ and $\mathbf{1}^Tu = 1$. Note that by our positivity
assumption on $A$, necessarily $M > 0$. We now rescale, to write $w_i = u_i/M$, so $w_i \ge 0$,
$A^Tw \ge \mathbf{1}$ and $\mathbf{1}^Tw = M^{-1} = z$ (say). We are interested in the maximum guaranteed
return, so we need to maximise $M$, and this is clearly the same as minimising $z$. Rowman’s
problem is then the linear programming problem

 minimise $\mathbf{1}^Tw$ subject to $A^Tw \ge \mathbf{1}$ and $w \ge 0$.

Before going any further, we note that this is in the form to be the dual problem of a
linear programming problem, namely the problem:

 maximise $\mathbf{1}^Tx$ subject to $Ax \le \mathbf{1}$ and $x \ge 0$,

in which $c = \mathbf{1}$ and $b = \mathbf{1}$, and we note that the first vector is of length $n$ and the second
of length $m$.

Now consider the situation from the point of view of Columnman, who wishes to minimise
his losses. Assuming that Rowman chooses pure strategy $i$ and that he uses the mixed
strategy $v$, so $v \ge 0$ and $\mathbf{1}^Tv = 1$, his expectation is to lose $\sum_{j=1}^{n} a_{ij}v_j$. He will lose at most
$M'$, whatever choice Rowman makes, if
\[
\sum_{j=1}^{n} a_{ij}v_j \le M' \quad\text{for } i = 1, \dots, m.
\]

Clearly Columnman wishes to make $M'$ as small as possible. Again rescaling, we write
$x_j = v_j/M'$, and we note, as above, that $M' > 0$. Then Columnman’s problem is to
minimise $M'$, or maximise $1/M' = z'$; again we can phrase this as a linear programming
problem, specifically, Columnman wishes to

 maximise $\mathbf{1}^Tx$ subject to $Ax \le \mathbf{1}$ and $x \ge 0$,

which is of course the dual of Rowman’s problem.

Using pure strategies, since $M > 0$ and $M' > 0$, we see there is necessarily a feasible
solution to each problem; hence by the Duality Theorem, Theorem 5.6, the primal and dual
problems both have optimal solutions with the same optimal value, and $\max z' = \min z$, or
\[
\beta^* = \min M' = \max M = \alpha^* .
\]


□ 




The value of a matrix game and optimum strategies for the players can be found using 
the simplex algorithm. The method is to solve Columnman’s problem: maximise l T x 
subject to Ax < 1, x > 0. 

As a first step, one must make A positive by adding a suitable constant k to each entry 
of A. Then take an initial tableau as given in Table 6.1 and use the simplex algorithm to 



n columns 

m columns 

e 

b 




0 

1 

m 



0 

1 

rows 

A 

Im 

0 

1 

e 

-1 -1 ... -1 

0 0 ... 0 

1 

0 


Table 6.1: Initial tableau for Columnman’s problem, 
obtain a final tableau show in Table 6.2. 



n columns 

m columns 

e 





0 


m 



0 


rows 



0 

X 

e 

> 0 > 0 ... >0 

w 

1 

z 


Table 6.2: Final tableau for Columnman’s problem. 

Note that the w which appears under the columns corresponding to the slack variables 
is the optimal solution of the dual problem, as explained in the proof of the Fundamental 
Theorem of Duality, Theorem 5.6, as shown specifically in Table 5.2. 

The value of the game is given by $\alpha^* = \beta^* = 1/z - k$, and optimum strategies for R
and C are given by $\mathbf{u}^* = (1/z)\mathbf{w}$ and $\mathbf{v}^* = (1/z)\mathbf{x}$.

Notice that $\mathbf{x}$ is not actually the last column of the final tableau as it stands. For $\mathbf{x}$, one
must take the whole $n$-vector that gives the solution to the linear programming problem
(denoted in Chapter 4 by $\mathbf{x} = (x_1, x_2, \dots, x_n)$) and not the $m$-vector given by the values
conventionally denoted by $u_1, u_2, \dots, u_m$. This is clarified in the example that follows.
6.14. Example. Solve the matrix game given by

$$A = \begin{pmatrix} -1 & 0 & 2 \\ 4 & 2 & -1 \end{pmatrix}.$$

It is worth checking to see if the game is strictly determined. We have $\alpha = -1$ and
$\beta = 2$, so it is not.
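The check for strict determinacy is easy to mechanise. The following is a minimal sketch in Python; the function name `lower_upper_values` is our own and not part of the course material. It takes the lower value as the largest row minimum and the upper value as the smallest column maximum, applied to the payoff rows (-1, 0, 2) and (4, 2, -1), which are the rows of Table 6.3 with the shift k = 2 removed.

```python
def lower_upper_values(payoff):
    """Return (alpha, beta) for a payoff matrix given as a list of rows."""
    # alpha: the best that Rowman can guarantee, the largest of the row minima.
    alpha = max(min(row) for row in payoff)
    # beta: the most that Columnman need concede, the smallest of the column maxima.
    beta = min(max(col) for col in zip(*payoff))
    return alpha, beta


A = [[-1, 0, 2],
     [4, 2, -1]]
alpha, beta = lower_upper_values(A)
verdict = "strictly determined" if alpha == beta else "not strictly determined"
print(alpha, beta, verdict)
```

The game is strictly determined exactly when the two values agree, so the same function can be used on the matrices in the questions at the end of the chapter.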

Begin by adding $k$ to each entry with, say, $k = 2$. This ensures there are feasible solutions
of the linear programming problem, as described in the proof of the theorem. The initial
tableau is given in Table 6.3.



















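|       | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | e | b |
|-------|-------|-------|-------|-------|-------|---|---|
| $a_4$ | 1     | 2     | 4     | 1     | 0     | 0 | 1 |
| $a_5$ | 6     | 4*    | 1     | 0     | 1     | 0 | 1 |
| e     | -1    | -1    | -1    | 0     | 0     | 1 | 0 |

Table 6.3: Initial tableau for Example 6.14.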


We now pivot about the entry indicated. Note that in principle we could have swapped
any of $a_1$, $a_2$ or $a_3$ into the basis; we have made the choice more or less at random. Doing
the pivot gives the second tableau shown in Table 6.4.



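|       | $a_1$ | $a_2$ | $a_3$  | $a_4$ | $a_5$ | e | b   |
|-------|-------|-------|--------|-------|-------|---|-----|
| $a_4$ | -2    | 0     | (7/2)* | 1     | -1/2  | 0 | 1/2 |
| $a_2$ | 3/2   | 1     | 1/4    | 0     | 1/4   | 0 | 1/4 |
| e     | 1/2   | 0     | -3/4   | 0     | 1/4   | 1 | 1/4 |

Table 6.4: Second tableau for Example 6.14.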

This time only the $a_3$ column has the proper sign for improvement; pivoting as shown, we obtain
the final tableau in Table 6.5.



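|       | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | e | b    |
|-------|-------|-------|-------|-------|-------|---|------|
| $a_3$ | -4/7  | 0     | 1     | 2/7   | -1/7  | 0 | 1/7  |
| $a_2$ | 23/14 | 1     | 0     | -1/14 | 2/7   | 0 | 3/14 |
| e     | 1/14  | 0     | 0     | 3/14  | 1/7   | 1 | 5/14 |

Table 6.5: Final tableau for Example 6.14.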

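So $z = 5/14$, and we can read off the optimal solutions for both the initial problem and
the dual; they are

$$\mathbf{w} = \left(\tfrac{3}{14}, \tfrac{1}{7}\right) \quad\text{and}\quad \mathbf{x} = \left(0, \tfrac{3}{14}, \tfrac{1}{7}\right).$$

We thus get the value of the game as

$$\alpha^* = \beta^* = \frac{1}{z} - k = \frac{14}{5} - 2 = \frac{4}{5},$$

where of course we now remove the constant $k$ added on above. The optimum strategy for
Columnman is then given by

$$\mathbf{v} = \frac{1}{z}\mathbf{x} = \frac{14}{5}\left(0, \tfrac{3}{14}, \tfrac{1}{7}\right) = \left(0, \tfrac{3}{5}, \tfrac{2}{5}\right).$$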



























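Here of course we have reported the full solution $\mathbf{x}$, rather than the basic solution
corresponding to the last column of the matrix. Similarly the optimal strategy for Rowman
is

$$\mathbf{u} = \frac{1}{z}\mathbf{w} = \frac{14}{5}\left(\tfrac{3}{14}, \tfrac{1}{7}\right) = \left(\tfrac{3}{5}, \tfrac{2}{5}\right).$$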




Of course we can work without applying the “shift” by k to the game. Here is the 
calculation; if anything it is slightly easier than before. 

    > with(linalg):
    > A := matrix(3, 7, [-1,0,2,1,0,0,1, 4,2,-1,0,1,0,1, -1,-1,-1,0,0,1,0]);

              [ -1     0     2     1     0     0     1  ]
       A :=   [  4     2    -1     0     1     0     1  ]
              [ -1    -1    -1     0     0     1     0  ]

    > B := mulrow(A, 1, 1/2): B := pivot(B, 1, 3);

              [ -1/2   0     1    1/2    0     0    1/2 ]
       B :=   [  7/2   2     0    1/2    1     0    3/2 ]
              [ -3/2  -1     0    1/2    0     1    1/2 ]

    > C := mulrow(B, 2, 1/2): C := pivot(C, 2, 2);

              [ -1/2   0     1    1/2    0     0    1/2 ]
       C :=   [  7/4   1     0    1/4   1/2    0    3/4 ]
              [  1/4   0     0    3/4   1/2    1    5/4 ]

However, we have no guarantee that this calculation is relevant to our original problem;
indeed if we end up with an objective value which is zero, or even negative, it isn't useful,
as you can quickly see if you try to translate back to the original problem. So the original
shift by k is important!
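The whole procedure (shift by k, build the tableau of Table 6.1, pivot until the objective row is non-negative, then read off x, w and z) can be sketched in code. What follows is a minimal illustration in Python using exact fractions, not a robust solver. The function name `solve_matrix_game` and the choice of Bland's rule for selecting pivots are our own assumptions; the hand calculation above chose its pivot columns more or less at random. The unchanging e column is left out of the tableau.

```python
from fractions import Fraction


def solve_matrix_game(payoff, k=0):
    """Return (value, u, v) for the game with the given payoff rows.

    k is the constant added to every entry; it must make them all positive.
    """
    m, n = len(payoff), len(payoff[0])
    shifted = [[Fraction(a) + k for a in row] for row in payoff]
    if any(a <= 0 for row in shifted for a in row):
        raise ValueError("choose k so that every shifted entry is positive")

    # Constraint rows [A | I_m | b]; the e column never changes, so it is omitted.
    rows = [row + [Fraction(int(i == j)) for j in range(m)] + [Fraction(1)]
            for i, row in enumerate(shifted)]
    # Objective row [-1 ... -1 | 0 ... 0 | 0].
    obj = [Fraction(-1)] * n + [Fraction(0)] * (m + 1)
    basis = [n + i for i in range(m)]  # the slack variables start in the basis

    while True:
        # Entering column: first negative entry in the objective row (Bland's rule).
        col = next((j for j in range(n + m) if obj[j] < 0), None)
        if col is None:
            break
        # Leaving row: smallest ratio b_i / a_i,col over the positive entries,
        # ties broken by the smaller basis index.
        _, _, r = min((rows[i][-1] / rows[i][col], basis[i], i)
                      for i in range(m) if rows[i][col] > 0)
        pivot = rows[r][col]
        rows[r] = [a / pivot for a in rows[r]]
        for i in range(m):
            if i != r:
                factor = rows[i][col]
                rows[i] = [a - factor * p for a, p in zip(rows[i], rows[r])]
        factor = obj[col]
        obj = [a - factor * p for a, p in zip(obj, rows[r])]
        basis[r] = col

    z = obj[-1]
    x = [Fraction(0)] * n
    for i, var in enumerate(basis):
        if var < n:            # only the original variables belong to x
            x[var] = rows[i][-1]
    w = obj[n:n + m]           # dual solution, read from under the slack columns
    return 1 / z - k, [wi / z for wi in w], [xj / z for xj in x]


value, u, v = solve_matrix_game([[-1, 0, 2], [4, 2, -1]], k=2)
print(value, u, v)
```

Called on the game of Example 6.14 with k = 2, the sketch passes through the same second and final tableaux as Tables 6.4 and 6.5, although it reaches them by a different first pivot. Calling it with a k that leaves a non-positive entry raises an error rather than returning a result that cannot be translated back.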

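6.15. Example. Solve the matrix game given by

$$A = \begin{pmatrix} 4 & 2 \\ 1 & 3 \\ 3 & 4 \end{pmatrix}.$$

Solution As before we use the fundamental theorem of game theory. Every entry is already
positive, so no shift is needed. The full set of tableaux is given in Table 6.6.

In this example, we have $z = 3/10$, and we can read off the optimal solutions for both the
initial problem and the dual; they are

$$\mathbf{w} = \left(\tfrac{1}{10}, 0, \tfrac{1}{5}\right) \quad\text{and}\quad \mathbf{x} = \left(\tfrac{1}{5}, \tfrac{1}{10}\right).$$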











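|       | $a_1$ | $a_2$   | $a_3$ | $a_4$ | $a_5$ | e | b    |
|-------|-------|---------|-------|-------|-------|---|------|
| $a_3$ | 4*    | 2       | 1     | 0     | 0     | 0 | 1    |
| $a_4$ | 1     | 3       | 0     | 1     | 0     | 0 | 1    |
| $a_5$ | 3     | 4       | 0     | 0     | 1     | 0 | 1    |
| e     | -1    | -1      | 0     | 0     | 0     | 1 | 0    |
| $a_1$ | 1     | 1/2     | 1/4   | 0     | 0     | 0 | 1/4  |
| $a_4$ | 0     | 5/2     | -1/4  | 1     | 0     | 0 | 3/4  |
| $a_5$ | 0     | (5/2)*  | -3/4  | 0     | 1     | 0 | 1/4  |
| e     | 0     | -1/2    | 1/4   | 0     | 0     | 1 | 1/4  |
| $a_1$ | 1     | 0       | 2/5   | 0     | -1/5  | 0 | 1/5  |
| $a_4$ | 0     | 0       | 1/2   | 1     | -1    | 0 | 1/2  |
| $a_2$ | 0     | 1       | -3/10 | 0     | 2/5   | 0 | 1/10 |
| e     | 0     | 0       | 1/10  | 0     | 1/5   | 1 | 3/10 |

Table 6.6: Complete tableau for Example 6.15.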


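The value of the game is $\alpha^* = \beta^* = 10/3$; the optimum strategy for Columnman is then
given by

$$\mathbf{v} = \frac{1}{z}\mathbf{x} = \frac{10}{3}\left(\tfrac{1}{5}, \tfrac{1}{10}\right) = \left(\tfrac{2}{3}, \tfrac{1}{3}\right),$$

while for Rowman it is

$$\mathbf{u} = \frac{1}{z}\mathbf{w} = \frac{10}{3}\left(\tfrac{1}{10}, 0, \tfrac{1}{5}\right) = \left(\tfrac{1}{3}, 0, \tfrac{2}{3}\right).$$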


6.5 Questions 5 (Hints and solutions start on page 120.) 

6.1. Q. Rowman and Columnman play the following game in which they conceal and then display 
coins. Each first hides in his hand a “stake” of zero, one or two £1 coins. They then reveal what 
is in their hand at the same time. If both players reveal the same number of coins, Rowman pays 
to Columnman the total number of coins both have shown; if the numbers of coins are different, 
Columnman pays Rowman the total number of coins both have shown. Write down the payoff 
matrix for Rowman and show that the game is not strictly determined. 

State the linear programming problem that Columnman has to solve to find his optimal strategy. 
Assuming that the corresponding final tableau is 


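| Basis | $a_1$ | $a_2$ | $a_3$ | $a_4$   | $a_5$ | $a_6$   | e | b    |
|-------|-------|-------|-------|---------|-------|---------|---|------|
| $a_2$ | 0     | 1     | 0     | 21/76   | -9/38 | 1/76    | 0 | 1/19 |
| $a_3$ | 0     | 0     | 1     | 23/152  | 1/76  | -17/152 | 0 | 1/19 |
| $a_1$ | 1     | 0     | 0     | -49/152 | 21/76 | 23/152  | 0 | 2/19 |
| e     | 0     | 0     | 0     | 2/19    | 1/19  | 1/19    | 1 | 4/19 |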


determine the optimal strategies for Rowman and Columnman and derive the value of the game. 
You are not asked to derive the final tableau yourself. 



















6.2. Q. The game of “matrix hide and seek” is played between players H and S. Player H first 
chooses an entry in the matrix 


$$\begin{pmatrix} 1 & 2 & 3 \\ 2 & 4 & 3 \end{pmatrix}$$

and “hides” at that position. Player S then “seeks” by choosing either one of the two rows or one 
of the three columns, so has a choice from 5 selections in all. If H is hiding in the row or column 
selected by S then H pays £k to S, where k is the entry in the matrix at which he “hid”. Otherwise 
S pays H £1. Write down the payoff matrix for S and show that the game is not strictly determined. 

6.3. Q. Suppose that two matrix games are given by 

$$A = \begin{pmatrix} 4 & 2 \\ 8 & 6 \\ 2 & 4 \end{pmatrix}, \qquad B = \begin{pmatrix} 2 & 6 \\ 8 & 4 \\ 4 & 2 \end{pmatrix},$$

where the matrices give the payoff to Rowman. Obtain the lower and upper values $\alpha$ and $\beta$ for each
game, and say which game is strictly determined.

Use the simplex method to solve the other game, showing that the value of the game lies between
$\alpha$ and $\beta$. Give the optimum strategy for each of the two players.

If you were Rowman, would you, given the choice, play game A or game B? Why? 

6.4. Q. Consider the matrix game given by

$$A = \begin{pmatrix} 3 & 2 \\ 2 & 6 \\ 1 & 5 \end{pmatrix},$$

where the matrix gives the payoff to Rowman. Obtain the lower and upper values $\alpha$ and $\beta$ for this
game, and show that the game is not strictly determined. Explain which of Rowman's pure
strategies will clearly not feature in his optimal strategy.

Express Columnman’s problem as a linear programming problem. Write down the initial tableau 
for its solution and apply the Simplex Algorithm to derive a final tableau. 

One way of doing this leads to the final tableau shown in Table 6.7. Using this final tableau 
state the value of the game, and give optimal strategies for Rowman and Columnman. 


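| Basis | $a_1$ | $a_2$ | $a_3$ | $a_4$  | $a_5$ | e | b    |
|-------|-------|-------|-------|--------|-------|---|------|
| $a_1$ | 1     | 0     | 3/7   | -1/7   | 0     | 0 | 2/7  |
| $a_2$ | 0     | 1     | -1/7  | 3/14   | 0     | 0 | 1/14 |
| $a_5$ | 0     | 0     | 2/7   | -13/14 | 1     | 0 | 5/14 |
| e     | 0     | 0     | 2/7   | 1/14   | 0     | 1 | 5/14 |

Table 6.7: One possible final tableau for Question 6.4.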








Chapter 7 


Non-linear Optimisation Problems 


7.1 Relaxing linearity conditions 

We have seen in the preceding sections how the assumption of linearity, both of the
objective function and the constraints, allowed a very effective optimisation procedure to
be developed. In the remainder of these notes, we explore the situation in which these
restrictions no longer hold. In other words, we consider the problem

maximise $f(\mathbf{x})$, subject to constraints $c_j(\mathbf{x}) = 0$ for $1 \le j \le k$.

In this generality, nothing useful can be said. The situation is sometimes summed up by
saying that both the objective function and the constraints are non-linear and is referred
to as a non-linear optimisation problem. However the term “non-linear” should not be
thought of as descriptive, but simply as an indication of what fails. To make the point more
directly, it may be quite reasonable to study (say) bananas, but what does it mean to study
“non-bananas”? An apple is a non-banana, but so is a University course in optimisation!

Here is an example of what looks like a very simple problem. You are given n distinct
points in the plane, say $(x_i, y_i)$ for $1 \le i \le n$, and have to minimise the objective function

$$f(x, y) = \sum_{i=1}^{n} \sqrt{(x - x_i)^2 + (y - y_i)^2}.$$

Clearly such a minimum exists. You may wish to experiment numerically in the particular
case when the three points are at (0,0), (1,0) and (0,2). You should find a unique optimal
solution

(x*, y*) = (0.3045036, 0.2545693)

to seven figures. Although it looks simple, in general there is no explicit solution to this
problem; note that in the example above, the optimal solution is surprisingly far away from
the centre of mass (0.333, 0.667) of the three points.
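One way to carry out the numerical experiment is the classical Weiszfeld iteration, which repeatedly replaces the current point by a weighted average of the data points, each weighted by the reciprocal of its distance from the current point. The notes do not prescribe a method, so this choice, the function name `fermat_weber` and the stopping tolerance are all our own assumptions; the sketch below is in Python.

```python
import math


def fermat_weber(points, tol=1e-12, max_iter=10000):
    """Approximate the point minimising the sum of distances to `points`."""
    # Start from the centre of mass, which is not the answer but is a safe guess.
    x = sum(px for px, _ in points) / len(points)
    y = sum(py for _, py in points) / len(points)
    for _ in range(max_iter):
        num_x = num_y = denom = 0.0
        for px, py in points:
            d = math.hypot(x - px, y - py)
            if d == 0.0:
                # The plain update is undefined at a data point; stop here.
                return x, y
            num_x += px / d
            num_y += py / d
            denom += 1.0 / d
        new_x, new_y = num_x / denom, num_y / denom
        if math.hypot(new_x - x, new_y - y) < tol:
            return new_x, new_y
        x, y = new_x, new_y
    return x, y


print(fermat_weber([(0, 0), (1, 0), (0, 2)]))
```

At the limit point the three unit vectors pointing towards the data points sum to zero, which is the first derivative condition for the objective function above.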

This problem is known as the Fermat-Weber problem; you can find more information
in Kaplan (1999). It is more obviously practical if for example you consider the objective
function as the cost of the water pipes which have to be laid in order to connect each house
(at position $(x_i, y_i)$) with water from a central supply located at $(x, y)$. In this guise, the
problem is also known as the facility location problem.




In the remainder of this section we take up the study of non-linear optimisation problems
in general, but our motivation is to study one of the simplest possible extensions of
the linear problem, namely the situation when the objective function, but not the constraints,
is allowed to have second order, or quadratic terms as well as linear terms. Then
in Chapter 8 we take a very different view, and indicate some modern approaches that can
be useful for very general optimisation problems. Finally in Chapter 9 we investigate an
intermediate technique which is probabilistic, and so needs modern computing tools, yet
builds on much of the traditional technique. A good overview of the variety of methods
for such problems can be found in Michalewicz & Fogel (2000). I recommend this quite
strongly as an interesting and slightly eccentric book.¹

It is tempting to say there are no general methods, simply a collection of techniques 
which have been explored for particular types of problems. However there is of course a 
reliance on simpler techniques we have already studied, and this suggests one approach to 
more general problems: 

• extract one or more linear subproblems; or 

• convert the original problem to an unconstrained one, for which simpler methods are 
available. 

Our first step is to note that certain types of constrained optimisation problems are 
familiar from calculus, and indeed that we traditionally pass to the unconstrained situation. 


7.2 Maxima and Minima 


As in one variable calculations, one use for derivatives in several variables is in calculating
maxima and minima. Again as for one variable, we shall rely on the theorem that if $f$ is
continuous on a closed bounded subset of $\mathbb{R}^2$, then it has a global maximum and a global
minimum. And again as before, we note that these must occur either at a local maximum or
minimum, or else on the boundary of the region. Of course in $\mathbb{R}$, the boundary of the region
usually consisted of a pair of end points, while in $\mathbb{R}^2$, the situation is more complicated.
However, the principle remains the same. And we can test for local maxima and minima
in the same way as for one variable.


7.1. Definition. Say that $f(x, y)$ has a critical point at $(a, b)$ if and only if

$$\frac{\partial f}{\partial x}(a, b) = \frac{\partial f}{\partial y}(a, b) = 0.$$


It is clear by comparison with the single variable result, that a necessary condition that
$f$ have a local extremum at $(a, b)$ is that it have a critical point there, although that is not
a sufficient condition. We refer to this as the first derivative test.

We can get more information by looking at the second derivative. Recall that we gave
a number of different notations for partial derivatives, and in what follows we use $f_x$ rather
than the more cumbersome $\partial f/\partial x$ etc. This idea extends to higher derivatives; we shall use
$f_{xx}$ instead of $\partial^2 f/\partial x^2$ and $f_{xy}$ instead of $\partial^2 f/\partial x\,\partial y$ etc.


¹ It has chapter headings such as “Who Owns the Zebra?” and “What’s the Color of the Bear?”.







7.2. Theorem (Second Derivative Test). Assume that $(a, b)$ is a critical point for $f$.
Then

• If, at $(a, b)$, we have $f_{xx} < 0$ and $f_{xx}f_{yy} - f_{xy}^2 > 0$, then $f$ has a local maximum at
$(a, b)$.

• If, at $(a, b)$, we have $f_{xx} > 0$ and $f_{xx}f_{yy} - f_{xy}^2 > 0$, then $f$ has a local minimum at
$(a, b)$.

• If, at $(a, b)$, we have $f_{xx}f_{yy} - f_{xy}^2 < 0$, then $f$ has a saddle point at $(a, b)$.

The test is inconclusive at $(a, b)$ if $f_{xx}f_{yy} - f_{xy}^2 = 0$, and the investigation has to be
continued some other way.
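The test is mechanical enough to write down as a short function. The following Python sketch takes the three second derivatives evaluated at a critical point and returns the verdict of the theorem; the function name `classify_critical_point` is ours.

```python
def classify_critical_point(fxx, fyy, fxy):
    """Apply the second derivative test to values taken at a critical point."""
    discriminant = fxx * fyy - fxy ** 2
    if discriminant > 0:
        # A positive discriminant forces fxx to be non-zero, so its sign decides.
        return "local maximum" if fxx < 0 else "local minimum"
    if discriminant < 0:
        return "saddle point"
    return "inconclusive"


print(classify_critical_point(2, 2, 0))    # second derivatives of x^2 + y^2
print(classify_critical_point(0, 0, 1))    # second derivatives of xy
```

The function only applies the test; finding the critical point in the first place, and checking that the point really is critical, is left to the reader.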

Note that the discriminant is easily remembered as

$$\Delta = \begin{vmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{vmatrix} = f_{xx}f_{yy} - f_{xy}^2.$$

A number of very simple examples can help to remember this. After all, the result of the
test should work on things where we can do the calculation anyway!

7.3. Example. Show that $f(x, y) = x^2 + y^2$ has a minimum at (0, 0).

Of course we know it has a global minimum there, but here goes with the test:

Solution We have $f_x = 2x$; $f_y = 2y$, so $f_x = f_y = 0$ precisely when $x = y = 0$, and this is the
only critical point. We have $f_{xx} = f_{yy} = 2$; $f_{xy} = 0$, so $\Delta = f_{xx}f_{yy} - f_{xy}^2 = 4 > 0$ and,
since $f_{xx} > 0$, there is a local minimum at (0, 0).

7.4. Example. Let $f(x, y) = xy$. Show there is a unique critical point, which is a saddle
point.

Solution $f_x = y$, $f_y = x$, and so there is a critical point only at (0, 0). Also $f_{xx} = 0 = f_{yy}$;
$f_{xy} = 1$, so $\Delta = -1$ and indeed we have a saddle point at (0, 0).


Proof. We give an indication of how the theorem can be derived, or if necessary how it
can be remembered. We start with the two dimensional version of Taylor’s theorem. We
have

$$f(a + h, b + k) \approx f(a, b) + h\frac{\partial f}{\partial x}(a, b) + k\frac{\partial f}{\partial y}(a, b) + \frac{1}{2}\left(h^2\frac{\partial^2 f}{\partial x^2} + 2hk\frac{\partial^2 f}{\partial x\,\partial y} + k^2\frac{\partial^2 f}{\partial y^2}\right),$$

where we have actually taken an expansion to second order and assumed the corresponding
remainder is small.

We are looking at a critical point, so for any pair $(h, k)$, we have
$h\,\frac{\partial f}{\partial x}(a, b) + k\,\frac{\partial f}{\partial y}(a, b) = 0$
and everything hinges on the behaviour of the second order terms. It is thus enough to
study the behaviour of the quadratic $Ah^2 + 2Bhk + Ck^2$, where we have written

$$A = \frac{\partial^2 f}{\partial x^2} = f_{xx}, \qquad B = \frac{\partial^2 f}{\partial x\,\partial y} = f_{xy}, \qquad\text{and}\qquad C = \frac{\partial^2 f}{\partial y^2} = f_{yy}.$$












Assuming that $A \ne 0$ we can write

$$Ah^2 + 2Bhk + Ck^2 = A\left[\left(h + \frac{B}{A}k\right)^2 + \left(\frac{C}{A} - \frac{B^2}{A^2}\right)k^2\right] = A\left[\left(h + \frac{B}{A}k\right)^2 + \frac{\Delta}{A^2}k^2\right],$$

where we write $\Delta = CA - B^2$ for the discriminant. We have thus expressed the quadratic
as the sum of two squares. It is thus clear that

• if $A < 0$ and $\Delta > 0$ we have a local maximum;

• if $A > 0$ and $\Delta > 0$ we have a local minimum; and

• if $\Delta < 0$ then the coefficients of the two squared terms have opposite signs, so by
going out in two different directions, the quadratic may be made either to increase or
to decrease.

Note also that we could have completed the square in the same way, but starting from the
$k$ term, rather than the $h$ term; so the result could just as easily be stated in terms of $C$
instead of $A$. □

7.5. Example. Find the extrema of $f(x, y) = xy - x^2 - y^2 + 2x + 2y + 4$ in the positive
quadrant.

Solution Since $f_x = y - 2x + 2$ and $f_y = x - 2y + 2$, we have a critical point when
$f_x = f_y = 0$, or $(x, y) = (2, 2)$. Then $f_{xx} = -2$; $f_{yy} = -2$, and $f_{xy} = 1$, and so
$\Delta = f_{xx}f_{yy} - f_{xy}^2 = 4 - 1 = 3 > 0$. Since $f_{xx} < 0$, we have a local maximum; in fact it is a
global maximum.

Note that, had the objective function been linear, rather than of at most second order, 
this would have been a very simple linear programming problem with no constraints. We 
now consider optimisation problems with additional constraints. 

7.6. Example. An open-topped rectangular tank is to be constructed so that the sum of 
the height and the perimeter of the base is 30 metres. Find the dimensions which maximise 
the surface area of the tank. What is the maximum value of the surface area? [You may 
assume that the maximum exists, and that the corresponding dimensions of the tank are 
strictly positive.] 

Solution Let the dimensions of the box be as shown in Fig 7.1, and let the area of the
surface of the material be $S$. Then

$$S = 2xh + 2yh + xy,$$

and since, from our restriction on the base and height,

$$30 = 2(x + y) + h, \quad\text{we have}\quad h = 30 - 2(x + y).$$

Substituting, we have

$$S = 2(x + y)\bigl(30 - 2(x + y)\bigr) + xy = 60(x + y) - 4(x + y)^2 + xy,$$





Figure 7.1: A dimensioned box 


and for physical reasons, $S$ is defined for $x \ge 0$, $y \ge 0$ and $x + y \le 15$.

A global maximum (which we are given exists) can only occur on the boundary of the
domain of definition of $S$, or at a critical point, when $\partial S/\partial x = \partial S/\partial y = 0$. On the boundary of
the domain of definition of $S$, we have $x = 0$ or $y = 0$ or $x + y = 15$, in which case $h = 0$.
We are given that we may ignore these cases. Now

$$S = -4x^2 - 4y^2 - 7xy + 60x + 60y,$$

so

$$\frac{\partial S}{\partial x} = -8x - 7y + 60 = 0, \qquad \frac{\partial S}{\partial y} = -8y - 7x + 60 = 0.$$

Subtracting gives $x = y$ and so $15x = 60$, or $x = y = 4$. Thus $h = 14$ and the surface
area is $S = 16(-4 - 4 - 7 + 15 + 15) = 240$ square metres. Since we are given that a
maximum exists, this must be it. [If both sides of the surface are counted, the area is
doubled, but the critical proportions are still the same.]
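The arithmetic in this example can be checked with a few lines of exact computation. The sketch below, in Python, solves the pair of linear equations for the critical point by Cramer's rule and then recovers the height and the surface area. The function name `tank_critical_point` and the parameter `total` (the prescribed sum of height and base perimeter, 30 in the example) are our own.

```python
from fractions import Fraction


def tank_critical_point(total=30):
    """Return (x, y, h, S) at the critical point of the open tank problem."""
    # Setting the partial derivatives of S to zero gives the linear system
    #   8x + 7y = 2*total,   7x + 8y = 2*total.
    rhs = Fraction(2 * total)
    det = 8 * 8 - 7 * 7
    x = (rhs * 8 - 7 * rhs) / det      # Cramer's rule, first unknown
    y = (8 * rhs - 7 * rhs) / det      # Cramer's rule, second unknown
    h = total - 2 * (x + y)            # from the constraint on height and perimeter
    surface = 2 * x * h + 2 * y * h + x * y
    return x, y, h, surface


print(tank_critical_point())
```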


7.3 Lagrange’s Principle 

We saw that one way to solve a constrained optimisation problem was to use the constraints
to eliminate some of the variables, and so reduce to an unconstrained problem using fewer
variables. This is effective when it works, but although the elimination is almost always
possible in principle,² in practice we may not be able to get the explicit formula needed to
carry on with an analytic solution. A more organised way of solving such problems uses
Lagrange’s principle.

7.7. Theorem (Lagrange’s Principle). Suppose there are constants $\lambda_1, \lambda_2, \dots, \lambda_k$
such that $\mathbf{x}_0 \in \mathbb{R}^n$ is an unconstrained maximum of

$$f(\mathbf{x}) - \sum_{j=1}^{k} \lambda_j c_j(\mathbf{x})$$

and that in addition $c_j(\mathbf{x}_0) = 0$ for $1 \le j \le k$. Then $\mathbf{x}_0$ is a solution of the constrained
maximisation problem:

maximise $f(\mathbf{x})$ subject to $c_j(\mathbf{x}) = 0$ for $1 \le j \le k$.

² It can be done iff the conditions of the implicit function theorem hold.

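Proof. Suppose we have such a local maximum $\mathbf{x}_0$. Then there is a neighbourhood $U$ of $\mathbf{x}_0$
such that

$$f(\mathbf{x}_0) = f(\mathbf{x}_0) - \sum_{j=1}^{k} \lambda_j c_j(\mathbf{x}_0) \ge f(\mathbf{x}) - \sum_{j=1}^{k} \lambda_j c_j(\mathbf{x}) \qquad (\mathbf{x} \in U).$$

If we now restrict attention to those points in $U$ for which $c_j(\mathbf{x}) = 0$ for $1 \le j \le k$, we
have $\sum_{j=1}^{k} \lambda_j c_j(\mathbf{x}) = 0$, and so

$$f(\mathbf{x}_0) \ge f(\mathbf{x}) \qquad (\mathbf{x} \in U \text{ and } c_j(\mathbf{x}) = 0 \text{ for } 1 \le j \le k).$$

Thus $\mathbf{x}_0$ is a constrained maximiser of $f$. □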

7.8. Remark. The same argument shows that if there is a global maximum of the unconstrained
problem which satisfies the same conditions, this will give a global constrained
maximum. Replacing $f$ by $-f$ gives the same results for minima.

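In practice it is relatively easy to apply this principle, provided we have enough
differentiability to allow calculus-based optimisation. Let

$$L(\mathbf{x}) = L(\mathbf{x}, \boldsymbol{\lambda}) = f(\mathbf{x}) - \sum_{j=1}^{k} \lambda_j c_j(\mathbf{x})$$

and say that $L$ is the Lagrangian function. We seek values $\mathbf{x}_0 \in \mathbb{R}^n$ and
$\boldsymbol{\lambda} = (\lambda_1, \lambda_2, \dots, \lambda_k)$ such that $L(\mathbf{x})$ has a local maximum and in addition $c_j(\mathbf{x}_0) = 0$ for $1 \le j \le k$.
One way to do this is to look for critical points: those points at which

$$\frac{\partial L}{\partial x_i} = 0 \qquad (1 \le i \le n).$$

Note also that

$$\frac{\partial L}{\partial \lambda_j} = -c_j(\mathbf{x}) \quad (1 \le j \le k), \quad\text{and so}\quad c_j(\mathbf{x}) = 0 \ (1 \le j \le k) \ \text{ iff } \ \frac{\partial L}{\partial \lambda_j} = 0 \ (1 \le j \le k).$$

Thus if we require that $L(\mathbf{x}, \boldsymbol{\lambda})$ has a critical point, as a function of both $\mathbf{x}$ and $\boldsymbol{\lambda}$, or
equivalently that

$$\nabla L(\mathbf{x}, \boldsymbol{\lambda}) = 0,$$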

we have a point at which we can potentially apply Lagrange’s principle. Formally we still 
have more checking to do, but usually the hard part is to find the point in the first place. 

7.9. Example. Use Lagrange’s principle to re-work Example 7.6. 




Solution We use Lagrange’s method to derive a potential critical point. We are required
to maximise $S = 2xh + 2yh + xy$ subject to the constraint that $h + 2(x + y) = 30$. Let

$$L(\mathbf{x}, \lambda) = 2xh + 2yh + xy - \lambda(h + 2x + 2y - 30).$$

Then at a critical point of $L(\mathbf{x}, \lambda)$ we have

$$\frac{\partial L}{\partial x} = 2h + y - 2\lambda = 0, \qquad \frac{\partial L}{\partial y} = 2h + x - 2\lambda = 0,$$

$$\frac{\partial L}{\partial h} = 2x + 2y - \lambda = 0, \qquad \frac{\partial L}{\partial \lambda} = -(h + 2x + 2y - 30) = 0.$$

The third equation gives $\lambda = 2(x + y)$, while from the first two, $y - x = 0$, so $x = y$. Thus
$\lambda = 4x$ and $2h = 7x$. Using the constraint we see that $7x/2 + 4x = 30$, so $x = 4$. Thus
there is a single critical point of $L$, which occurs at $x = y = 4$ and $h = 14$. Finally we note
that this solution satisfies the (implicit) reality constraints, that $x > 0$, $y > 0$ and $h > 0$.
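In this example the four conditions for a critical point of the Lagrangian are linear in the unknowns x, y, h and λ, so they can be solved mechanically. The following Python sketch does this by Gauss-Jordan elimination with exact fractions. The helper `solve_linear` is a generic routine written for this illustration, not something taken from the notes, and it assumes the system has a unique solution.

```python
from fractions import Fraction


def solve_linear(coeffs, rhs):
    """Solve a square linear system by Gauss-Jordan elimination (exact)."""
    n = len(coeffs)
    # Augmented matrix [coeffs | rhs] with exact rational entries.
    aug = [[Fraction(v) for v in row] + [Fraction(b)] for row, b in zip(coeffs, rhs)]
    for c in range(n):
        # Find a row at or below position c with a non-zero entry in column c.
        p = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[p] = aug[p], aug[c]
        pivot = aug[c][c]
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [aug[r][n] for r in range(n)]


# Unknowns in the order (x, y, h, lambda); one row per condition.
coeffs = [
    [0, 1, 2, -2],   # dL/dx:      2h + y - 2*lambda = 0
    [1, 0, 2, -2],   # dL/dy:      2h + x - 2*lambda = 0
    [2, 2, 0, -1],   # dL/dh:      2x + 2y - lambda  = 0
    [2, 2, 1, 0],    # constraint: h + 2x + 2y       = 30
]
x, y, h, lam = solve_linear(coeffs, [0, 0, 0, 30])
print(x, y, h, lam)
```

The value of the multiplier is a by-product of the calculation; only x, y and h are needed to answer the original question.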


7.10. Example. Assume now that the open topped box of Example 7.6 is to have a fixed
surface area. Find the proportions so that the volume be a maximum.

Solution We are required to maximise $V = xyh$ subject to the constraint that the surface
area $2h(x + y) + xy$ is some constant, say $K$. Let

$$L(\mathbf{x}, \lambda) = xyh - \lambda\bigl(2h(x + y) + xy - K\bigr).$$

Then at a critical point of $L(\mathbf{x}, \lambda)$ we have

$$\frac{\partial L}{\partial x} = yh - 2h\lambda - y\lambda = 0, \qquad \frac{\partial L}{\partial y} = xh - 2h\lambda - x\lambda = 0,$$

$$\frac{\partial L}{\partial h} = xy - 2\lambda(x + y) = 0, \qquad \frac{\partial L}{\partial \lambda} = -\bigl(2h(x + y) + xy - K\bigr) = 0.$$

The first two equations give $(y - x)h = \lambda(y - x)$, so $x = y$ or $h = \lambda$. If $h = \lambda$, the first
equation shows that $h\lambda = 0$. We exclude $h = 0$ on physical grounds, because we know the
maximum volume will be strictly positive. Thus if $h\lambda = 0$ we must have $\lambda = 0$ and the first
equation then shows that $hy = 0$, which we exclude for the same reason. Thus we must
have $x = y$.

The third equation now gives $\lambda = x/4$; we then solve the first equation to see that
$h = y/2$. Thus there is a local constrained maximum in the volume when the base is square
with height half the length of its side.
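A quick numerical check of this conclusion is possible for a particular surface area. The Python sketch below takes K = 12, a value chosen only because it makes the numbers exact: with x = y and h = x/2 the constraint reads 3x² = K, so x = y = 2, h = 1 and λ = 1/2. It verifies that all four partial derivatives of the Lagrangian vanish there, and compares the volume with that of a few neighbouring boxes of the same surface area. The function names are ours.

```python
from fractions import Fraction


def lagrangian_gradient(x, y, h, lam, K):
    """Partial derivatives of L = xyh - lam*(2h(x+y) + xy - K)."""
    return (
        y * h - 2 * h * lam - y * lam,        # dL/dx
        x * h - 2 * h * lam - x * lam,        # dL/dy
        x * y - 2 * lam * (x + y),            # dL/dh
        -(2 * h * (x + y) + x * y - K),       # dL/dlambda
    )


def volume_on_constraint(x, y, K):
    """Volume of the box with base x by y whose height makes the area equal K."""
    h = (K - x * y) / (2 * (x + y))
    return x * y * h


K = Fraction(12)
x = y = Fraction(2)
h, lam = y / 2, x / 4
gradient = lagrangian_gradient(x, y, h, lam, K)
best = volume_on_constraint(x, y, K)
step = Fraction(1, 10)
nearby = [volume_on_constraint(x + i * step, y + j * step, K)
          for i in (-1, 0, 1) for j in (-1, 0, 1)]
print(gradient, best, max(nearby))
```

Comparing with nine neighbouring boxes is of course only a spot check, not a proof that the critical point is a maximum.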


7.4 Inequality Constraints 

We now consider the more general situation in which the constraints are allowed to be
inequalities rather than equalities. Of course we could use the usual “slack variables”
technique to stick with “equality” constraints, but there are advantages in working with the
minimum number of variables. We thus consider problems of the form:

minimise $f(\mathbf{x})$ subject to $c_j(\mathbf{x}) \ge 0$ for $1 \le j \le k$. (NLCO)




A typical example might be the pair

$$c_1(\mathbf{x}) = 1 - x_1^2 - x_2^2, \qquad c_2(\mathbf{x}) = x_2 - x_1^2.$$

The points satisfying these constraints lie within the unit circle and above the parabola
$x_2 = x_1^2$. It is clear how we can minimise an affine function geometrically subject to these
constraints.

7.11. Definition. We say the $j$th constraint is active or tight at $\mathbf{x}$ if $c_j(\mathbf{x}) = 0$. Otherwise
we say the constraint is slack.

Let $\mathbf{x}^*$ be a solution of the constrained non-linear optimisation problem (NLCO). We
find it convenient to describe the set of tight constraints, so let $I = \{1, 2, \dots, k\}$ and let

$$I^* = \{\, j \in I \mid c_j(\mathbf{x}^*) = 0 \,\}$$

so that $I^*$ is the set of indices of tight constraints. Thus $\mathbf{x}^*$ minimises $f(\mathbf{x})$ with $c_j(\mathbf{x}^*) = 0$
for $j \in I^*$.

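We now ignore for the moment the fact that, until we have found $\mathbf{x}^*$, we know nothing
about the set $I^*$; in that case, we are trying to solve a constrained minimisation problem
with equality constraints, and so can use Lagrange’s principle, via the Lagrangian, and seek
$\mathbf{x}^*$ and $\lambda_j$ ($j \in I^*$) at points where

$$\nabla L(\mathbf{x}, \boldsymbol{\lambda}) = \nabla\Bigl(f - \sum_{j \in I^*} \lambda_j c_j\Bigr)(\mathbf{x}^*) = 0.$$

Now define $\lambda_j = 0$ for $j \in I \setminus I^*$. This means that the “full Lagrangian” $f - \sum_{j \in I} \lambda_j c_j$ is
defined, and we have

$$\nabla L(\mathbf{x}, \boldsymbol{\lambda}) = \nabla\Bigl(f - \sum_{j \in I} \lambda_j c_j\Bigr)(\mathbf{x}^*) = 0$$

since the extra terms are identically zero. We thus have $\lambda_j c_j(\mathbf{x}^*) = 0$ for all $j$, since either
$\lambda_j = 0$ for slack constraints, or $c_j(\mathbf{x}^*) = 0$ for active constraints.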

We now consider the sign of $\lambda_j$, giving an heuristic justification for the fact that $\lambda_j \ge 0$
for all $j$. Recall that for any function $g(\mathbf{x})$, the derivative $\nabla g$ is a vector which points in the
direction of increasing values of $g$. Thus if $c_j$ is a tight constraint, and $\mathbf{x}^*$ is the minimiser,
$\nabla c_j(\mathbf{x})|_{\mathbf{x} = \mathbf{x}^*}$ points in the direction of increasing $c_j$ and hence into the feasible region.
Assume now that for the minimiser $\mathbf{x}^*$ the only tight constraint is $c_j$. Since $\nabla L(\mathbf{x}, \boldsymbol{\lambda}) = 0$,
we have

$$\nabla f(\mathbf{x}^*) = \sum_{i=1}^{k} \lambda_i \nabla c_i(\mathbf{x}^*) = \lambda_j \nabla c_j(\mathbf{x}^*).$$

If $\lambda_j < 0$, $\nabla f(\mathbf{x}^*)$ points out of the feasible region, and hence we can reduce $f$ by moving
into the feasible region. This contradicts the fact that $\mathbf{x}^*$ is a local minimiser for $f$.

The argument is more complicated if two or more constraints are tight at the same
time, but this gives the flavour. What we have established is that at $\mathbf{x}^*$ the Kuhn-Tucker
conditions hold.




7.12. Definition (Kuhn-Tucker Conditions). The following three sets of conditions
make up the Kuhn-Tucker conditions:

• $\nabla f(\mathbf{x}) - \sum_{j=1}^{k} \lambda_j \nabla c_j(\mathbf{x}) = 0$;

• $c_j(\mathbf{x}) \ge 0$ and $\lambda_j \ge 0$ $(1 \le j \le k)$; and

• $\lambda_j c_j(\mathbf{x}) = 0$ $(1 \le j \le k)$.

We have intentionally not been precise about the differentiability conditions needed to
get these results. Formally we have the following theorem.

7.13. Theorem (First Kuhn-Tucker). Let $\mathbf{x}^*$ be a local minimiser for the problem

minimise $f(\mathbf{x})$ subject to $c_j(\mathbf{x}) \ge 0$, $(1 \le j \le k)$.

Then provided the objective function $f$ and the constraints $c_j$ are (twice) continuously
differentiable, and the set

$$\{\nabla c_j(\mathbf{x}^*) \mid j \in I^*\}$$

is linearly independent in $\mathbb{R}^n$, there are constants $\lambda_1, \lambda_2, \dots, \lambda_k$ such that the Kuhn-Tucker
conditions are satisfied.

7.14. Remark. The derivative condition is almost always satisfied, only being violated when
two constraints happen to be tangential.

7.15. Example. Minimise $x^2 + 2y^2$ subject to $x + y \ge 2$ and $x \le y$.

Solution At this stage, we simply seek possible local minimisers. We have

$$L(x, y, \lambda, \mu) = x^2 + 2y^2 - \lambda(x + y - 2) - \mu(y - x),$$

where we have arranged the constraints and the signs of $\lambda$ and $\mu$ to be consistent with our
definition of the “full Lagrangian”. The Kuhn-Tucker conditions thus become:

$$\frac{\partial L}{\partial x} = 2x - \lambda + \mu = 0, \qquad \frac{\partial L}{\partial y} = 4y - \lambda - \mu = 0, \tag{7.1}$$

$$x + y \ge 2, \qquad y - x \ge 0, \tag{7.2}$$

$$\lambda \ge 0, \qquad \mu \ge 0, \tag{7.3}$$

$$\lambda(x + y - 2) = 0, \qquad \mu(y - x) = 0. \tag{7.4}$$

We now solve this set of constraints. If $\lambda = 0$ we have $2x + \mu = 0$ and $4y = \mu$. Thus
$x + y = -\mu/2 + \mu/4 = -\mu/4$. Since we know that $\mu \ge 0$, we see that $x + y \le 0$ and so
cannot satisfy the constraint $x + y \ge 2$. It follows that $\lambda \ne 0$.

Next consider what happens if we assume that $\mu = 0$. Then arguing as before, we get
$2x = \lambda$ and $4y = \lambda$ so that $x = \lambda/2$ and $y = \lambda/4$. Since $\lambda > 0$, we have $y < x$ and so cannot
satisfy the constraint $y - x \ge 0$. It follows that $\mu \ne 0$.

We have now shown that each constraint is tight, so we are, in effect, solving the simpler
problem in which the constraints are equalities. We have $x = y$, and then $x = y = 1$, which
is in fact a global minimiser.
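It is worth confirming that the point found really does satisfy every one of the Kuhn-Tucker conditions, which also means finding the multipliers. At x = y = 1 the two stationarity equations read λ - μ = 2 and λ + μ = 4, so λ = 3 and μ = 1, both non-negative. The Python sketch below checks the full set of conditions for this example; the function name `kuhn_tucker_holds` is ours, and the function is specific to this objective and this pair of constraints.

```python
def kuhn_tucker_holds(x, y, lam, mu):
    """Check the Kuhn-Tucker conditions for: minimise x^2 + 2y^2
    subject to x + y - 2 >= 0 and y - x >= 0."""
    c1 = x + y - 2
    c2 = y - x
    stationary = (2 * x - lam + mu == 0) and (4 * y - lam - mu == 0)
    feasible = c1 >= 0 and c2 >= 0
    multipliers_non_negative = lam >= 0 and mu >= 0
    complementary = (lam * c1 == 0) and (mu * c2 == 0)
    return stationary and feasible and multipliers_non_negative and complementary


print(kuhn_tucker_holds(1, 1, 3, 1))   # the candidate found above
print(kuhn_tucker_holds(1, 1, 3, 0))   # wrong multiplier: stationarity fails
```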




7.16. Example. How does the above example change when the second constraint becomes
$ay \ge x$ for some $a \ge 2$?

Solution The example is of course very similar, until we come to examine whether the
constraints are tight or not. We have

$$L(x, y, \lambda, \mu) = x^2 + 2y^2 - \lambda(x + y - 2) - \mu(ay - x),$$

where we have arranged the constraints and the signs of $\lambda$ and $\mu$ to be consistent with our
definition of the “full Lagrangian”. The Kuhn-Tucker conditions thus become:

$$\frac{\partial L}{\partial x} = 2x - \lambda + \mu = 0, \qquad \frac{\partial L}{\partial y} = 4y - \lambda - a\mu = 0, \tag{7.5}$$

$$x + y \ge 2, \qquad ay - x \ge 0, \tag{7.6}$$

$$\lambda \ge 0, \qquad \mu \ge 0, \tag{7.7}$$

$$\lambda(x + y - 2) = 0, \qquad \mu(ay - x) = 0. \tag{7.8}$$

We now solve this set of constraints. If we assume that $\mu = 0$, we get $2x = \lambda$ and $4y = \lambda$ so
that $x = \lambda/2$ and $y = \lambda/4$. Since $x + y \ge 2$, we cannot have $\lambda = 0$, so must have $x + y = 2$,
and thus $\lambda = 8/3$, $x = 4/3$ and $y = 2/3$. Since $a \ge 2$ this satisfies the second constraint,
and the corresponding value of the objective function is $16/9 + 8/9 = 8/3$.

If $\lambda = 0$ we have $2x + \mu = 0$ and $4y = a\mu$, so that $x = -\mu/2 \le 0$. We can rule this case
out immediately: since $x + y \ge 2$, we have $y \ge 2$
and the objective value is at least 4, and so not a minimum value.
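Because the objective function is simple, the conclusion can also be checked by brute force. The Python sketch below searches a grid of rational points for the feasible point with the smallest value of x² + 2y². The grid spacing of 1/30 and the search box from -2 to 4 in each variable are arbitrary choices of ours, made so that the expected minimiser x = 4/3, y = 2/3 lies exactly on the grid. The function name `grid_minimum` is also ours.

```python
from fractions import Fraction


def grid_minimum(a, steps_per_unit=30, low=-2, high=4):
    """Search a grid for the minimiser of x^2 + 2y^2 subject to
    x + y >= 2 and a*y >= x; return (value, x, y)."""
    best = None
    for i in range(low * steps_per_unit, high * steps_per_unit + 1):
        x = Fraction(i, steps_per_unit)
        for j in range(low * steps_per_unit, high * steps_per_unit + 1):
            y = Fraction(j, steps_per_unit)
            if x + y >= 2 and a * y >= x:          # keep feasible points only
                value = x * x + 2 * y * y
                if best is None or value < best[0]:
                    best = (value, x, y)
    return best


print(grid_minimum(2))   # the relaxed constraint 2y >= x
print(grid_minimum(1))   # the original constraint y >= x
```

Taking a = 1 recovers the previous example, so the two calls show directly how relaxing the second constraint moves the minimiser and lowers the minimum value.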


7.5 Convexity 

We conclude this section by giving a geometrical condition which enables us to guarantee
solutions are available using the above techniques. One way to guarantee good behaviour
is to work with convex functions. A convex set is geometrically very simple. A set $C$ is
convex if $\mathbf{x}^1 \in C$, $\mathbf{x}^2 \in C$ means that $\lambda\mathbf{x}^1 + (1 - \lambda)\mathbf{x}^2 \in C$ whenever $0 \le \lambda \le 1$. In other
words, the line segment joining any two points in $C$ lies entirely in $C$.

You may already have met a convex function: one in which the points above the graph
of the function (the “supergraph”) form a convex set. Equivalently, a function $f$ is convex if
the line segment joining two points on the graph of $f$ always lies on or above the graph of $f$.

An obvious way to generate a convex set is as the intersection of a collection of half
planes; hence the set of feasible solutions of a linear programming problem forms a (possibly
empty) convex set.

There are other very natural ways to generate convex sets. Let $\mathbf{x}^1, \mathbf{x}^2, \dots, \mathbf{x}^k$ be
elements of $\mathbb{R}^n$. It is essentially trivial to show that the set of all convex combinations of
$\mathbf{x}^1, \mathbf{x}^2, \dots, \mathbf{x}^k$ is a convex set. Here is the argument: let $\mathbf{u} = \lambda_1\mathbf{x}^1 + \lambda_2\mathbf{x}^2 + \dots + \lambda_k\mathbf{x}^k$
and $\mathbf{v} = \mu_1\mathbf{x}^1 + \mu_2\mathbf{x}^2 + \dots + \mu_k\mathbf{x}^k$ be two convex combinations of $\mathbf{x}^1, \mathbf{x}^2, \dots, \mathbf{x}^k$. Write
down $\lambda\mathbf{u} + (1 - \lambda)\mathbf{v}$ and show that, when $0 \le \lambda \le 1$, this is also a convex combination of
$\mathbf{x}^1, \mathbf{x}^2, \dots, \mathbf{x}^k$.

It is of interest that the set of optimal solutions of a linear programming problem is a
convex set. The next two examples discuss this:




7.17. Example. Show that if $\mathbf{x}^1, \mathbf{x}^2, \dots, \mathbf{x}^k$ are optimal solutions of a linear programming
problem, then any convex combination of $\mathbf{x}^1, \mathbf{x}^2, \dots, \mathbf{x}^k$ is also an optimal solution.

Solution Let $\lambda_1\mathbf{x}^1 + \lambda_2\mathbf{x}^2 + \dots + \lambda_k\mathbf{x}^k$ be any convex combination of $\mathbf{x}^1, \mathbf{x}^2, \dots, \mathbf{x}^k$. Let
the objective function be $c_1x_1 + c_2x_2 + \dots + c_nx_n$. Since $\mathbf{x}^1, \mathbf{x}^2, \dots, \mathbf{x}^k$ all have the same
objective value, $z$ say, then, for all $i$,

$$c_1x_1^i + c_2x_2^i + \dots + c_nx_n^i = z,$$

where $\mathbf{x}^i = (x_1^i, x_2^i, \dots, x_n^i)$. So (why?),

$$c_1(\lambda_1x_1^1 + \dots + \lambda_kx_1^k) + \dots + c_n(\lambda_1x_n^1 + \dots + \lambda_kx_n^k) = z,$$

which shows (why?) that $\lambda_1\mathbf{x}^1 + \lambda_2\mathbf{x}^2 + \dots + \lambda_k\mathbf{x}^k$ also has objective value $z$. Hence the
result.

This solution can be written more simply by using matrix notation, letting the objective
function be $\mathbf{c}^T\mathbf{x}$.
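For the record, here is one way the matrix form of the argument might be set out. It assumes, as elsewhere in these notes, that the weights satisfy $\lambda_i \ge 0$ and $\sum_i \lambda_i = 1$. It also assumes the feasible region is written as $A\mathbf{x} \le \mathbf{b}$, $\mathbf{x} \ge 0$; that is our choice of standard form for the illustration, and the same steps work for equality constraints.

```latex
% Objective value of the convex combination: linearity, then c^T x^i = z.
\mathbf{c}^{T}\Bigl(\sum_{i=1}^{k} \lambda_i \mathbf{x}^{i}\Bigr)
  = \sum_{i=1}^{k} \lambda_i \, \mathbf{c}^{T}\mathbf{x}^{i}
  = \sum_{i=1}^{k} \lambda_i \, z
  = z \sum_{i=1}^{k} \lambda_i
  = z .

% Feasibility of the convex combination: each x^i is feasible and lambda_i >= 0.
A\Bigl(\sum_{i=1}^{k} \lambda_i \mathbf{x}^{i}\Bigr)
  = \sum_{i=1}^{k} \lambda_i \, A\mathbf{x}^{i}
  \le \sum_{i=1}^{k} \lambda_i \, \mathbf{b}
  = \mathbf{b},
\qquad
\sum_{i=1}^{k} \lambda_i \mathbf{x}^{i} \ge 0 .
```

The first line answers both of the “why?” prompts in the componentwise solution, and the second supplies the feasibility check that the componentwise version leaves implicit.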


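7.18. Example. Prove (by induction on $k$) that, if $\mathbf{x}^1, \mathbf{x}^2, \dots, \mathbf{x}^k$ belong to a convex set $C$,
then any convex combination of $\mathbf{x}^1, \mathbf{x}^2, \dots, \mathbf{x}^k$ belongs to $C$.

Solution Let $\mathbf{v} = \lambda_1\mathbf{x}^1 + \lambda_2\mathbf{x}^2 + \dots + \lambda_k\mathbf{x}^k$ be any convex combination of $\mathbf{x}^1, \mathbf{x}^2, \dots, \mathbf{x}^k$.
Write $\mathbf{v} = (1 - \lambda_k)\mathbf{u} + \lambda_k\mathbf{x}^k$, where

$$\mathbf{u} = \frac{\lambda_1}{1 - \lambda_k}\mathbf{x}^1 + \frac{\lambda_2}{1 - \lambda_k}\mathbf{x}^2 + \dots + \frac{\lambda_{k-1}}{1 - \lambda_k}\mathbf{x}^{k-1}.$$

By the induction hypothesis (why?), $\mathbf{u}$ belongs to $C$ and hence $\mathbf{v}$ belongs to $C$.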


We now return to our main thread; that convexity gives us a handle on uniqueness.
It is geometrically clear that a (strictly) convex function cannot have two different local
minima, while if the same value occurs as a local minimum at different places, the graph
must be flat between them, so neither minimum is isolated (and $f$ can’t be strictly convex).

Say that $f$ is concave if $-f$ is convex. To help remember the difference, note that $x^2$
is convex, while $-x^2$ is concave.

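Taylor's theorem in its general form allows us to analyse $f$. At the point $x$, we have

$$f(x + h) = f(x) + h \cdot \nabla f(x) + \tfrac{1}{2} h^T G(x) h + \text{higher order terms}$$

exactly as in the two variable case. We have written

$$G = \begin{pmatrix}
\dfrac{\partial^2 f}{\partial x_1^2} & \dfrac{\partial^2 f}{\partial x_1 \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_1 \partial x_n} \\
\dfrac{\partial^2 f}{\partial x_2 \partial x_1} & \dfrac{\partial^2 f}{\partial x_2^2} & \cdots & \dfrac{\partial^2 f}{\partial x_2 \partial x_n} \\
\vdots & \vdots & \ddots & \vdots \\
\dfrac{\partial^2 f}{\partial x_n \partial x_1} & \dfrac{\partial^2 f}{\partial x_n \partial x_2} & \cdots & \dfrac{\partial^2 f}{\partial x_n^2}
\end{pmatrix}$$

for the Hessian of $f$. Now $G(x)$ is symmetric and so it can be diagonalised. Let
$\lambda_1, \lambda_2, \ldots, \lambda_n$ be the eigenvalues of $G(x)$ and write $D(x) = D = \operatorname{diag}(\lambda_1, \lambda_2, \ldots, \lambda_n)$.
Then there is an orthogonal matrix $P$, so $P^T = P^{-1}$, such that $G(x) = P^T D P$, and we have

$$f(x + h) = f(x) + h \cdot \nabla f(x) + \tfrac{1}{2} (Ph)^T D(x) (Ph) + \text{higher order terms}.$$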



















At a critical point, $\nabla f(x) = 0$ and so

$$f(x + h) = f(x) + \tfrac{1}{2} (\lambda_1 u_1^2 + \lambda_2 u_2^2 + \cdots + \lambda_n u_n^2) + \text{higher order terms}$$

where $Ph = u$. Thus $f$ is convex iff $\lambda_i \ge 0$ for all $i$; we can read the behaviour of $f$ from
the set of eigenvalues of the Hessian. This leads to:

7.19. Theorem (Second Derivative Theorem). Assume that $f$ is twice differentiable
on the convex set $D$. Then $f$ is convex on $D$ if its Hessian $G$ is positive semidefinite, so
$x^T G x \ge 0$ for each $x$.

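Note that if $f$ is linear then it is both convex and concave. The function $f(x) =
x_1^2 + \cdots + x_n^2$ is strictly convex, since

$$G = \begin{pmatrix}
2 & 0 & \cdots & 0 \\
0 & 2 & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & 2
\end{pmatrix}.$$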
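
When no formula for the Hessian is to hand, the Second Derivative Theorem can still be explored numerically. The following is a minimal sketch only: the function names `hessian`, `quad_form` and `sum_sq`, the step size `h` and the use of central differences are illustrative assumptions, not part of these notes.

```python
def hessian(f, x, h=1e-3):
    """Approximate the Hessian G of f at the point x by central differences."""
    n = len(x)
    G = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            def shifted(si, sj):
                y = list(x)
                y[i] += si * h
                y[j] += sj * h
                return f(y)
            # Standard four-point formula for the mixed second derivative.
            G[i][j] = (shifted(1, 1) - shifted(1, -1)
                       - shifted(-1, 1) + shifted(-1, -1)) / (4 * h * h)
    return G


def quad_form(G, v):
    """Return v^T G v, the quantity that must be >= 0 for each v."""
    n = len(v)
    return sum(v[i] * G[i][j] * v[j] for i in range(n) for j in range(n))


def sum_sq(x):
    """The example from the text: f(x) = x_1^2 + ... + x_n^2."""
    return sum(t * t for t in x)


G = hessian(sum_sq, [1.0, -2.0, 3.0])
```

Here `G` should be close to twice the identity matrix, so `quad_form(G, v)` is positive for every non-zero `v` tried. A check of this kind can suggest convexity but, being a finite sample, does not prove it.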


7.20. Theorem (Second Kuhn-Tucker). Let $f : D \to \mathbb{R}$ be a convex function and
suppose that each constraint $c_j$ is concave on $D$. Assume that each of the functions involved
is twice continuously differentiable. Then every solution of the Kuhn-Tucker conditions is
a global minimum of the corresponding NLCO problem.

7.6 Questions 6 (Hints and solutions start on page 124.) 

7.1. Q. Describe Lagrange's principle briefly, and illustrate your answer by obtaining the maximum
value of $x_1^2 + \cdots + x_n^2$ subject to the constraint that $x_1 + \cdots + x_n = 1$.

7.2. Q. a) Solve the optimisation problem:

minimise $x_1^2 + x_2^2 + x_3^2$ subject to

$$x_1 + 2x_2 + x_3 = 4,$$
$$x_1 - 2x_2 + 2x_3 = 17.$$

b) Write down the Kuhn-Tucker conditions for the problem:

minimise $x_1^2 + x_2^2 + x_3^2$, subject to

$$x_1 + 2x_2 + x_3 \ge 4,$$
$$x_1 - 2x_2 + 2x_3 \le 17.$$

Find the global minimiser.

7.3. Q. The function $z = x_1 - x_2$ is to be minimised subject to the constraints that

$$x_1 + x_2 \le 4,$$
$$2x_1 - 3x_2 \le 0.$$

Sketch the feasible region. Hence obtain geometrically the optimal value and the corresponding
values of $x_1$ and $x_2$.

Write down the Kuhn-Tucker conditions for this problem. Illustrate their use by deriving the 
solution you have just obtained. What information do you get about the tightness of the constraints? 





Chapter 8 

Genetic Optimisation 


In Chapter 7 we discussed optimisation methods based on calculus techniques, showing that
although effective, the methods were based on the assumption that problems were quite
close to being linear. In contrast, in this chapter we describe a modern computer-based
method which makes no use of calculus. There are obvious reasons why calculus-based
methods may be inadequate:

• the function may naturally have sharp corners; 

• the function may be ill conditioned; 

• the function may have a very large number of critical points; or 

• the function may simply be given numerically, rather than with an explicit formula, 
so all differentiation has to be done that way as well. 

There is no single method to be described here, but rather a family of methods, called
genetic algorithms, which seek to evolve solutions to optimisation problems using the same
sort of “survival of the fittest” methods which occur in nature. Further details and many
references can be found in Goldberg (1989), which gives a relatively easy introduction to
the subject, or the more recent Mitchell (1996). A slightly less discursive treatment, with
more emphasis on mathematical optimisation problems, is in Michalewicz (1996).

8.1 A Simple Algorithm 

We begin this study by applying the method to a very simple problem: that of maximising
the function $x^2$ over a closed interval. The choice of such a trivial problem means we can
concentrate on the methodology, while being able to see how the algorithm has progressed
towards the optimum at any time.

8.1.1 Coding the Domain 

We shall be concerned both with the values of an integer $k$ and also with its representation.
Note that 15 and 16 are adjacent values, but have very different (binary) representations
as is clear from the sample in Table 8.1. This difference is sometimes called a “Hamming
cliff” to describe the sharp discontinuity.




In much of what we do, this can cause problems. However, there is a simple solution,
known as Gray coding, which avoids the problem by ensuring that adjacent numbers
have “binary” codes which differ by at most one bit.¹ If in addition, we require Gray coded
integers to have the same code length as their binary values, and that variation occurs from
the “bottom up” when there is a choice, this is enough to define the code by induction.
Table 8.2 gives the first few values.


 n   Code      n   Code
 0   00000     8   01000
 1   00001     9   01001
 2   00010    10   01010
 3   00011    11   01011
 4   00100    12   01100
 5   00101    13   01101
 6   00110    14   01110
 7   00111    15   01111

Table 8.1: Binary codes.

 n   Code      n   Code
 0   00000     8   01100
 1   00001     9   01101
 2   00011    10   01111
 3   00010    11   01110
 4   00110    12   01010
 5   00111    13   01011
 6   00101    14   01001
 7   00100    15   01000

Table 8.2: Gray codes.
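
The Gray codes in Table 8.2 can be generated mechanically. The sketch below uses the standard "reflected" Gray code, computed as `n XOR (n >> 1)`, which reproduces the table; the function names are illustrative choices and not part of the notes.

```python
def to_gray(n: int) -> int:
    """Convert a non-negative integer to its reflected Gray code."""
    return n ^ (n >> 1)


def from_gray(g: int) -> int:
    """Invert to_gray by XOR-ing together all right shifts of g."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def bits(value: int, width: int = 5) -> str:
    """Fixed-width bit string, as printed in Tables 8.1 and 8.2."""
    return format(value, f"0{width}b")


binary_table = [bits(n) for n in range(16)]          # Table 8.1
gray_table = [bits(to_gray(n)) for n in range(16)]   # Table 8.2
```

With this coding, consecutive integers differ in exactly one bit, so the step from 15 to 16 no longer crosses a Hamming cliff.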


We can now state the form of optimisation problem we shall deal with. We assume
given a function $f = f(k_1, \ldots, k_p)$ of $p$ parameters or variables, each of which has a finite
set of possible values. We assume in addition that $f$ is normalised, perhaps by adding a
suitable constant, so that $f(k_1, \ldots, k_p) \ge 0$. Our problem is then:

maximise $f$ subject to these restrictions. (GA)

The restriction to a finite number of parameters, each of which can take a finite set of values,
means we can encode a possible solution of such a problem as a string of bits, of length $n$,
say. For convenience we assume conversely that all such strings do give valid members of
the domain of $f$.

This is a considerable simplification, since at least in this case we are restricting to an 
unconstrained maximisation problem where standard techniques such as gradient ascent are 
available. We have already seen that constraints seem almost inevitable in real applications 
yet in general increase the difficulty in obtaining an optimum value very significantly. We 
show in Section 8.4 that a genetic algorithm can incorporate constraints with less difficulty 
than some other methods. 

With this assumption, the set of all possible solutions thus has size $2^n$, and a typical
solution is then

$$s = (0,1,1,1,0,0,1,0,1,0,0,0,1,1,1,0,1,1,0,0,\ldots,1,0) \in \{0,1\}^n.$$

Note that this string is an encoding of the particular choice of parameter values. We make 
no assumptions about what type of parameters they are, or how the encoding is done. If 
each parameter is a number, one way to code is first to create suitable binary (or Gray) 
codes for each parameter, and then concatenate the resulting strings to get a single string. 

¹ Recall that the word “bit” is an abbreviation for “binary digit”; the word “digit” alone implies a scale
of ten.











We then write $f(s)$ for the value $f(k_1, \ldots, k_p) \ge 0$, where $s$ is the string which is the
encoding of $(k_1, \ldots, k_p)$, and call $f(s)$ the fitness of the string $s$. It is natural then to say
that $s_1$ is fitter than $s_2$ if $f(s_1) > f(s_2)$. Thus our problem becomes that of finding the
fittest strings from our string universe $\{0,1\}^n$.

8.1. Example. A black box is fitted with five switches, each of which can be either “on” or
“off”. The output of the box (perhaps electrical potential or “voltage”) can be measured,
and varies according to the position of the switches. Find the positions of the switches
which maximise the output of the box.

Solution We develop a full solution slowly, but note there is an obvious binary encoding
of the solution space as follows:

$$(0, 0, 1, 1, 0) \longleftrightarrow (\text{off, off, on, on, off}).$$


Our main example is a simple mathematical one which presents no difficulties. Let
$f(x) = 1024x^2$. We wish to maximise $f$ on the domain $[0, 1)$, and work in units of $1/32$, so
in practice we seek the maximum of the finite set

$$f(0),\ f\left(\tfrac{1}{32}\right),\ f\left(\tfrac{2}{32}\right),\ \ldots,\ f\left(\tfrac{31}{32}\right).$$

Of course $f$ is monotone, so the maximum is $31^2 = 961$; we derive this result using a
genetic algorithm.
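
Because the universe here has only 32 strings, the answer can be checked by exhaustive search before any genetic machinery is introduced. This is a small sketch; the names `fitness` and `universe` are illustrative, and each string is read as the 5-bit binary code of $32x$.

```python
def fitness(s: str) -> int:
    """Fitness of a 5-bit string s: f(x) = 1024 x^2 where 32x = int(s, 2)."""
    k = int(s, 2)      # k = 32x, an integer in 0..31
    return k * k       # 1024 * (k / 32)^2 simplifies to k^2


universe = [format(k, "05b") for k in range(32)]
best = max(universe, key=fitness)
print(best, fitness(best))   # prints: 11111 961
```

Exhaustive search of this kind is exactly what becomes impracticable when the string length $n$ is large, which is the situation the genetic algorithm is intended for.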

8.1.2 Genetic algorithms: process model 

We first describe the model which motivates the detailed description of a genetic algorithm. 
In the model: 

• each string is a gene; 

• natural selection favours fit strings; 

• strings can be combined to produce new strings; 

• there is a rare random process called mutation which alters strings locally. 

We illustrate the process using our example and pass through a number of stages.

Gene Pool We first need to choose a population of strings, the gene pool, from which
the solution is to evolve. The assumption here is that it is not practicable to allow a gene
pool consisting of the universe $\{0,1\}^n$, otherwise the method simply reduces to that of
“exhaustive search”. For illustration, we choose a gene pool with population size $P = 4$
strings; in general the size of the gene pool is one of a number of choices that has to be
made by the designer of the algorithm. Our gene pool in this case is chosen “at random”,
and the fitness of each string or gene is evaluated to give Table 8.3.





 s       32x   f(x)
 01101   13    169
 11000   24    576
 01000    8     64
 10011   19    361

Table 8.3: Initial Gene Pool.


Natural Selection We now allow the fittest strings to survive and enter the mating pool
to produce a new generation. There are many ways of doing this while at the same time
keeping the population size $P$ constant. One natural way to select is to choose strings
at random with a probability proportional to the fitness of the string. In general, this will
mean that some strings are selected more than once, while others, usually the least fit, drop
out of the gene pool. In our example we assume that such a selection process included the
fittest string twice, and allowed the least fit string to disappear, giving the mating pool of
Table 8.4.


 s       32x   f(x)
 01101   13    169
 11000   24    576
 11000   24    576
 10011   19    361

Table 8.4: Mating Pool.
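
Fitness-proportional selection of this kind can be sketched in a few lines. The code is an illustration under stated assumptions: the names `roulette_select` and `expected_copies` are invented here, and the standard library call `random.choices` is used to draw $P$ strings with replacement.

```python
import random


def fitness(s: str) -> int:
    """f(x) = 1024 x^2 with 32x = int(s, 2), as in Table 8.3."""
    return int(s, 2) ** 2


def expected_copies(pool):
    """Expected number of copies of each string: P * f_i / F."""
    total = sum(fitness(s) for s in pool)
    return [len(pool) * fitness(s) / total for s in pool]


def roulette_select(pool, rng):
    """Draw len(pool) strings, each with probability proportional to fitness."""
    weights = [fitness(s) for s in pool]
    return rng.choices(pool, weights=weights, k=len(pool))


gene_pool = ["01101", "11000", "01000", "10011"]
copies = [round(c, 2) for c in expected_copies(gene_pool)]
mating_pool = roulette_select(gene_pool, random.Random(0))
```

For the gene pool of Table 8.3 the expected numbers of copies are about 0.58, 1.97, 0.22 and 1.23, so a mating pool containing the fittest string twice and omitting the least fit, as in Table 8.4, is a typical outcome rather than a guaranteed one.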


Mating We assume for simplicity that each pair of parent strings produces exactly two
offspring.² The genes of the offspring are obtained by mixing parental genes in an operation
known as crossover. This is illustrated in Fig. 8.1. The number of cross-overs, and the sites
of each cross-over, are chosen at random; in the example, there are two cross-over sites. The
offspring are shown in Fig. 8.2.


[Figure 8.1: Two strings with two cross-over sites.]

[Figure 8.2: Offspring: the result of the mating shown in Fig. 8.1.]


The effect of mating on our example population, with a single crossover site for each pair
(after the fourth bit for the first pair and after the second bit for the second pair), is given
in Table 8.5, and shows that following this evolutionary step, the
average fitness has increased from 292.5 to 439.

² Two offstrings?













 s       32x   f(x)
 01100   12    144
 11001   25    625
 11011   27    729
 10000   16    256

Table 8.5: After mating: this population forms the next generation, and has average fitness
439.
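
The entries of Table 8.5 can be reproduced from the mating pool of Table 8.4 with a single-site crossover. In this sketch the function name `crossover` and the convention that `site` counts the bits kept from the first parent are assumptions made for illustration; the sites 4 and 2 are the ones consistent with the tabulated offspring.

```python
def crossover(a: str, b: str, site: int):
    """Single-site crossover: swap the tails of a and b after `site` bits."""
    return a[:site] + b[site:], b[:site] + a[site:]


def fitness(s: str) -> int:
    """f(x) = 1024 x^2 with 32x = int(s, 2)."""
    return int(s, 2) ** 2


mating_pool = ["01101", "11000", "11000", "10011"]        # Table 8.4
children = [*crossover(mating_pool[0], mating_pool[1], 4),
            *crossover(mating_pool[2], mating_pool[3], 2)]
average = sum(fitness(s) for s in children) / len(children)
```

The children are the four strings of Table 8.5 and `average` is 438.5, which the text rounds to 439.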


We now discard the strings from the initial population, replacing them with their “children”,
and allow the process to be iterated; selection and mating produce further generations,
whose fitness we hope continues to increase.

Mutation Our initial population was a “good” one in that we can find individuals for
which any individual gene is either “on” or “off”. We could describe this as having genetic
diversity. However the evolutionary process will select against particular genes and allow
them to “die out”. An example of this is in our first generation population above; it is likely
that the first individual in this population, with fitness 144, will not be selected for mating,
and hence the only leading “0” in the gene pool will be eliminated. Since a “1” in this position
adds 16 to the value of $32x$, and so at least 256 to the fitness, it contributes strongly to the
fitness of the string, and we see here the
process working, with a “bad” gene being selected against.

As we have described the process so far, no future individual will have the top bit
missing. However there is a disadvantage in reducing genetic diversity, as can be seen with
the same individual, who also carries the only example of “1” in position 3, which we know
to be beneficial in this case. Thus at the point when this gene is eliminated, there is no
possibility of evolving to the optimal solution.

To counteract this effect a process of mutation is introduced in which, after mating,
each bit in the genetic material of the whole of the new population has a very small probability
of changing from a “0” to a “1” or vice-versa. This allows new genetic material to
be examined; the idea is that if the new gene is “better”, it will quickly be adopted by the
rest of the population.
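
Putting selection, mating and mutation together gives a complete, if naive, genetic algorithm for the example $f(x) = 1024x^2$. What follows is a sketch under stated assumptions rather than a definitive implementation: the mutation rate of 0.01, the number of generations, the random seed, and the `+ 1` added to each selection weight (so that the weights can never all be zero) are arbitrary illustrative choices, and all function names are invented here.

```python
import random


def fitness(s: str) -> int:
    """f(x) = 1024 x^2 with 32x = int(s, 2)."""
    return int(s, 2) ** 2


def crossover(a: str, b: str, site: int):
    """Single-site crossover: swap the tails of a and b after `site` bits."""
    return a[:site] + b[site:], b[:site] + a[site:]


def mutate(s: str, rate: float, rng) -> str:
    """Flip each bit of s independently with probability `rate`."""
    return "".join(
        ("1" if c == "0" else "0") if rng.random() < rate else c for c in s
    )


def next_generation(pool, rng, rate=0.01):
    """One step: roulette-wheel selection, pairwise mating, then mutation."""
    # The +1 is an assumption of this sketch: it keeps the weights positive.
    weights = [fitness(s) + 1 for s in pool]
    parents = rng.choices(pool, weights=weights, k=len(pool))
    children = []
    for a, b in zip(parents[0::2], parents[1::2]):
        site = rng.randrange(1, len(a))      # crossover site chosen at random
        children.extend(crossover(a, b, site))
    return [mutate(c, rate, rng) for c in children]


def run(generations=30, seed=0):
    """Evolve the gene pool of Table 8.3 for a fixed number of generations."""
    rng = random.Random(seed)
    pool = ["01101", "11000", "01000", "10011"]
    for _ in range(generations):
        pool = next_generation(pool, rng)
    return pool


final_pool = run()
```

Running `run()` with different seeds and comparing `max(fitness(s) for s in final_pool)` with the known optimum 961 gives a feel for how often so small a population reaches the optimal string, and for how much the answer depends on the mutation rate.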


8.2 Variations 

Almost none of the detail that has just been described is essential in order to produce 
a genetic algorithm capable of worthwhile results. For example there are many selection 
strategies to give the new generation; here are some. 

• Discard the least fit 50% of the population and allow each remaining string to breed
twice.

• Assume string $s_i$ has fitness $f_i$. A string is then selected for breeding with probability
$f_i/F$, where $F = \sum_i f_i$, with the full breeding population selected using $n$ independent
trials (roulette wheel breeding). Thus the possibility of a string mating with itself,
when crossover has no effect, is not excluded.





• As above, but first introduce the fittest string (or the fittest 5% of strings) to the
breeding pool.

• As in roulette wheel breeding, but the fittest string is selected to mate with the 
remaining fit strings. 

It is clear that many more strategies both for mating and selection can be described,
each tailored to overcome difficulties with the naive behaviour. Here are two examples of
the type of problem that can occur:

Dominance: it can happen, usually early in a run, that one string is very much fitter than
the others; “a few extraordinary individuals in a population of mediocres.”

Mediocrity: there can be little difference in fitness between the average and the best 
individuals, and the algorithm becomes “a random walk among the mediocre”. 

A solution to both of these problems is to rescale the fitness function; assuming that
$f \ge 0$, let $f_1 = af + b$, where $a$ and $b$ are chosen so that $f_1 \ge 0$ and so that the fittest
individual is twice as fit as the average.
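
The two conditions in the text do not by themselves fix $a$ and $b$. One common convention, adopted here purely as an assumption, is to keep the average fitness unchanged; together with "fittest is twice the average" this determines both constants. The function name `linear_scaling` and the parameter `factor` are illustrative.

```python
def linear_scaling(fitnesses, factor=2.0):
    """Return (a, b) with f1 = a*f + b such that the mean is unchanged
    and the largest scaled fitness equals factor * mean."""
    avg = sum(fitnesses) / len(fitnesses)
    top = max(fitnesses)
    if top == avg:               # all strings equally fit: nothing to rescale
        return 1.0, 0.0
    a = (factor - 1.0) * avg / (top - avg)
    b = avg * (top - factor * avg) / (top - avg)
    return a, b


raw = [169, 576, 64, 361]        # fitnesses from Table 8.3
a, b = linear_scaling(raw)
scaled = [a * f + b for f in raw]
```

For these values the scaled fitnesses stay positive, but that is not guaranteed in general: when the least fit string is far below the average, $af + b$ can become negative, and the requirement $f_1 \ge 0$ then forces a different choice of $a$ and $b$.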

8.2. Example. Use a genetic algorithm to solve the problem:

maximise $f(x) = 256\left[\tfrac{1}{4} - \left(x - \tfrac{1}{2}\right)^2\right]$ on $[0, 1)$ working on a grid of size $1/32$.

Solution This is relatively straightforward with the correct choice of coding, but a binary
coding is not sensible; note that although the maximum occurs when $x = 16/32$, the string
corresponding to $x = 15/32$ is also very fit, yet its binary representation differs in every
position from that of the fittest string. Evolution from $x = 15/32$ to $x = 16/32$ is thus very
unlikely in this coding.

The resolution in this example is to change the coding scheme; this is an example in 
which the benefits of Gray coding show clearly. However the underlying problem should be 
noted. We return to it in Section 8.3. 
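
The difference between the two codings at the "Hamming cliff" between 15 and 16 can be checked directly. This is a short sketch; the helper names are illustrative and the Gray code used is the reflected code of Table 8.2.

```python
def binary(n: int, width: int = 5) -> str:
    """Ordinary binary code of n, as in Table 8.1."""
    return format(n, f"0{width}b")


def gray(n: int, width: int = 5) -> str:
    """Reflected Gray code of n, as in Table 8.2."""
    return format(n ^ (n >> 1), f"0{width}b")


def hamming(a: str, b: str) -> int:
    """Number of positions in which the strings a and b differ."""
    return sum(ca != cb for ca, cb in zip(a, b))


print(hamming(binary(15), binary(16)))   # prints: 5
print(hamming(gray(15), gray(16)))       # prints: 1
```

In binary, five simultaneous bit changes are needed to move from 15 to 16; in Gray code a single mutation suffices.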

8.3. Example. Consider our original problem of maximising $f(x) = 1024x^2$, but now suppose
the formula is modified so that $f(0) = 1024$.

Solution Such an objective function almost necessarily defeats any advantage the genetic 
algorithm has. Although the maximum occurs at 0, there is no evidence near 0 that this 
string is in any way good. Unless the optimal solution is first found by chance, and then 
is deliberately retained, it will disappear. There is clearly no advantage here in using a 
genetic algorithm. Of course we can still hope for a “near optimal” solution as is likely to 
be obtained in this case. 


8.3 Further Discussion 

Clearly the idea of “evolving” a solution to a difficult problem under competitive pressure
is theoretically attractive. Having discussed how a genetic algorithm can achieve this in
principle, it is time to investigate whether the promise is fulfilled in practice. Much of this
section is taken from Mitchell (1996), which should be consulted for more details.




Our first example, of $f(x) = 1024x^2$, was easy to do in other ways. It is simple to check
that the algorithm given does work, and it is easy to see why that is. However it is much less
clear that an advantage accrues to the genetic methodology, as opposed to something less
organised. We start by giving a number of alternative possible search strategies which could
be used on the type of “large discrete search space” problems we have been considering.

GA: genetic algorithm as described above. 

SAHC: steepest ascent hill-climbing:

1. Choose a string at random, and call it the “current hilltop”;

2. Systematically flip each bit of the current hilltop and record the resulting fitness;

3. If there is no string with higher fitness, save the current hilltop and go to step
1; otherwise reset the string of highest fitness as “current hilltop” and go to step 2.

NAHC: next ascent hill climbing:

1. Choose a string at random, and call it the “current hilltop”;

2. Systematically flip each bit of the current hilltop until the resulting fitness
increases; if there is no increase, return the bit to its original value.

3. If there is no increase in fitness, save the current hilltop and go to step 1.

4. If a string of higher fitness is found, go to step 2, but continue mutating the
string from the position last changed.

RMHC: random mutation hill climbing:

1. Choose a string at random, and call it the “current hilltop”;

2. Choose a location at random and flip that bit of the current hilltop and record
the resulting fitness;

3. If the resulting string has higher fitness, reset the value of “current hilltop”.

4. Go to step 2.

In order to compare searches using comparable resources, the process should stop after the 
same specified number of fitness evaluations. 
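
As an illustration, random mutation hill climbing with a fixed budget of fitness evaluations might be sketched as follows. The function name `rmhc`, its parameters and the optional `accept_ties` flag are assumptions of this sketch; with the default setting it follows the four steps listed above, accepting only strictly fitter strings. The demonstration fitness simply counts 1's and is not the test problem of this section.

```python
import random


def rmhc(fitness, length, budget, rng, accept_ties=False):
    """Random mutation hill climbing, stopping after `budget` evaluations."""
    current = [rng.randint(0, 1) for _ in range(length)]   # step 1
    current_fit = fitness(current)
    evaluations = 1
    while evaluations < budget:
        pos = rng.randrange(length)                         # step 2
        candidate = current[:]
        candidate[pos] ^= 1                                 # flip one bit
        cand_fit = fitness(candidate)
        evaluations += 1
        better = cand_fit > current_fit
        tie = accept_ties and cand_fit == current_fit
        if better or tie:                                   # step 3
            current, current_fit = candidate, cand_fit
    return current, current_fit, evaluations                # budget exhausted


def count_ones(bits):
    """Toy fitness for demonstration only: the number of 1's in the string."""
    return sum(bits)


hilltop, hilltop_fit, used = rmhc(count_ones, 8, 5000, random.Random(1))
```

Counting evaluations inside the search routine, rather than counting iterations, is what makes comparisons between GA, SAHC, NAHC and RMHC fair in the sense just described. Setting `accept_ties=True` lets the search drift across plateaux of equal fitness, which matters for fitness functions that reward only completed blocks.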

The Problem is chosen to be simple, but to give some advantage to “evolution”. The
population consists of all binary strings of length 64. The fitness of a string is to be the
number of 1's present in “well positioned blocks”. We say that a block is “well positioned”
if it consists of 8 consecutive 1's, and the first of them occurs in position 1, 9, 17, etc. Blocks
of 8 consecutive 1's are not favoured in any other position. To illustrate this function, the
fitness of a number of strings is calculated in Table 8.6.
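
The fitness function just described is easy to state in code. This is a minimal sketch; the name `block_fitness` is invented here, and spaces in the input are ignored so that strings can be written in groups of eight as in Table 8.6.

```python
def block_fitness(s: str, block: int = 8) -> int:
    """Count the 1's lying in well positioned blocks of `block` consecutive 1's,
    that is, blocks starting at positions 1, 9, 17, ... of the string."""
    bits = s.replace(" ", "")
    full = "1" * block
    return sum(block
               for start in range(0, len(bits), block)
               if bits[start:start + block] == full)


zeros = " 00000000"
row2 = "11111111" + zeros * 7
row3 = "01111111 10000000" + zeros * 6
row4 = "11111111 11100000 11111111" + zeros * 5
print(block_fitness(row2), block_fitness(row3), block_fitness(row4))  # prints: 8 0 16
```

The second of these strings contains eight consecutive 1's, but they straddle a block boundary and so earn nothing.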

The results, given in Table 8.7, are not an overwhelming endorsement of the genetic
algorithm methodology. Testing was for 256,000 function evaluations unless the method
had earlier converged to the maximum possible fitness of 64. Each algorithm was tested
200 times; the mean and median number of function evaluations to convergence is given,
together with the standard error ($\sigma/\sqrt{\text{number of runs}}$) in brackets.





 String                                                                    Fitness
 00010011 00111000 00011111 10001110 00000111 11110000 00101011 00011100     0
 11111111 00000000 00000000 00000000 00000000 00000000 00000000 00000000     8
 01111111 10000000 00000000 00000000 00000000 00000000 00000000 00000000     0
 11111111 11100000 11111111 00000000 00000000 00000000 00000000 00000000    16
 11111111 00111111 10000111 11111110 00000000 00000000 11111111 11111111    24

Table 8.6: A fitness function designed to test the propagation of “well positioned” blocks
of 8 copies of 1. Only blocks whose positions reflect the underlying structure of the string
are rewarded.


 200 runs   GA             SAHC           NAHC           RMHC
 Mean       61334 (2304)   > 256000 (0)   > 256000 (0)   6197 (186)
 Median     54208          > 256000       > 256000       5775

Table 8.7: Number of function evaluations needed to find the fittest string using different
optimising algorithms.


This problem was specifically designed to “assist” the methodology of GA's and the
testing (Mitchell 1996, page 130) was certainly not by an opponent of the method. It is thus
clear that they are not an automatic choice, even if the problem appears appropriate.

8.4 The Travelling Salesman Problem 

We now discuss a classic optimisation problem in which the constraints form an essential 
part. 


A salesman has to visit $n$ cities and return to his city of origin. Each city has
to be visited exactly once, and the distance (or more generally the cost) of the
journey between each pair of cities is known. The problem is to do the tour at
minimum total cost.

This forms one of a class of problems known as NP-complete problems which are believed
to require a computation time $\exp(kn)$, where $k$ is a constant of the problem, for an exact
solution. The “standard” method of obtaining an approximate solution uses an optimisation
method known as simulated annealing which we discuss in Chapter 9. In this chapter we
show how it can be formulated as a linear programming problem, and as an example in
which genetic optimisation is possible. Although presented in this simple form, there are
applications, for example to circuit board design, equivalent to a tour of 17,000 cities.

8.4.1 Linear Programming Formulation 

We set up the problem by numbering each of the $n$ cities as $C_1, \ldots, C_n$; a tour is then just a
permutation, say $(k_1, k_2, k_3, \ldots, k_n)$ of $(1, \ldots, n)$. We interpret this as the tour which starts
at $C_{k_1}$, then goes to $C_{k_2}$ and so on, running through the remaining cities up to and including
$C_{k_n}$ before finally completing the tour back at $C_{k_1}$. It is thus clear that we can permute
the permutation cyclically without changing the tour. There is no loss of generality then
in assuming that $C_{k_1} = C_1$; it is then clear that there are $(n - 1)!$ different tours.³ This
also shows that solving the problem by exhaustive enumeration is infeasible unless there
are very few cities.

The problem can be formulated as a linear programming problem, as we now do. Introduce
(integer) variables $x_{ij}$ where $x_{ij} = 1$ when $C_j$ is the city immediately following $C_i$
in the tour, and $x_{ij} = 0$ otherwise. Then since exactly one city precedes $C_j$ in the tour we
have

$$\sum_{i=1}^{n} x_{ij} = 1, \quad \text{for each } j,$$

while since the tour visits city $C_i$ exactly once, exactly one city must follow $C_i$ and so

$$\sum_{j=1}^{n} x_{ij} = 1, \quad \text{for each } i.$$

Given that the distance (or perhaps the cost) of travelling from $C_i$ to $C_j$ is $c_{ij}$, the Travelling
Salesman Problem then involves minimising the total cost

$$\sum_{i=1}^{n} \sum_{j=1}^{n} c_{ij} x_{ij}$$

subject to the above constraints, the usual reality constraints that $x_{ij} \ge 0$ and the fact that
the $x_{ij}$ are in fact either 0 or 1.

However this does not capture the problem exactly; as it stands a pair of disjoint
subtours will be acceptable. To cope with this we introduce $n - 1$ new integral variables
$u_2, u_3, \ldots, u_n$ with $u_i \ge 0$ and $(n - 1)^2 - (n - 1)$ new constraints as follows:

$$u_i - u_j + n x_{ij} \le n - 1 \quad \text{for } i, j = 2, 3, \ldots, n \text{ and } i \ne j. \qquad (8.1)$$

8.4. Proposition. Any tour which satisfies all the above constraints does not split into
subtours.

Proof. We assume without loss of generality that we have a tour starting at $C_1$; assume
there is a subtour, so the tour starting at $C_1$ returns to $C_1$ before visiting all the cities.
Then there must be another subtour, which does not visit $C_1$ and which includes $r \le (n - 1)$
cities.

We add up the constraints 8.1 for each of the $r$ non-zero $x_{ij}$ on this tour to get $nr \le
(n - 1)r$, since the $u_i$ all occur twice and so cancel. This contradiction shows there are no
proper subtours. □

8.5. Proposition. Any solution of the Travelling Salesman Problem satisfies all the
constraints given above.

³ In general we don't identify a tour and the same tour done the opposite way around.





Proof. It is enough to show we can define variables $u_i$ which satisfy the constraints 8.1. Let
$u_i$ be the position on the route at which $C_i$ is visited. Thus if we have a tour of five cities,
say

$$C_1 - C_3 - C_5 - C_4 - C_2 - C_1$$

then $u_2 = 5$, $u_3 = 2$ and so on.

Consider now the constraint $u_i - u_j + n x_{ij} \le n - 1$ for fixed $i$ and $j \ne i$. Note that
$u_i \le n$ while $u_j \ge 2$, so if $x_{ij} = 0$ we have $u_i - u_j + n x_{ij} \le n - 2$ and our inequality holds.

In the remaining case, $x_{ij} = 1$, so $C_j$ is visited immediately after $C_i$ and $u_j = u_i + 1$.
Thus

$$u_i - u_j + n x_{ij} = u_i - (u_i + 1) + n = n - 1,$$

and again the constraint holds. □
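
Both propositions can be checked by brute force on small cases. The sketch below is illustrative only: the function names are invented, cities are numbered from 1 as in the text, and the search for $u_i$ in the subtour case is restricted to the range $0, \ldots, n$ purely to keep the enumeration small (Proposition 8.4 shows that no values at all can work).

```python
from itertools import permutations, product


def successor(tour):
    """Map each city to the city visited immediately after it on the tour."""
    return {tour[k]: tour[(k + 1) % len(tour)] for k in range(len(tour))}


def positions(tour):
    """u_i = position on the route at which city i is visited (from 1)."""
    return {city: k + 1 for k, city in enumerate(tour)}


def constraints_hold(succ, u, n):
    """Check u_i - u_j + n*x_ij <= n - 1 for all i != j in 2..n (constraints 8.1)."""
    for i in range(2, n + 1):
        for j in range(2, n + 1):
            if i != j:
                x_ij = 1 if succ[i] == j else 0
                if u[i] - u[j] + n * x_ij > n - 1:
                    return False
    return True


# Proposition 8.5 on the five-city tour used in the proof.
tour = [1, 3, 5, 4, 2]
ok = constraints_hold(successor(tour), positions(tour), 5)

# Every tour of five cities starting at city 1 passes the same check.
all_ok = all(constraints_hold(successor([1, *p]), positions([1, *p]), 5)
             for p in permutations([2, 3, 4, 5]))

# Two disjoint subtours 1-2-1 and 3-4-5-3: no choice of u in 0..5 works.
split = {1: 2, 2: 1, 3: 4, 4: 5, 5: 3}
some_u_works = any(constraints_hold(split, dict(zip([2, 3, 4, 5], u)), 5)
                   for u in product(range(6), repeat=4))
```

Here `ok` and `all_ok` come out true while `some_u_works` is false, in line with Propositions 8.5 and 8.4 respectively.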

8.6. Remark. We have not discussed linear programming problems with integral constraints,
the requirement that the variables be integers, anywhere in these notes. Thus
we have no way of actually solving the Travelling Salesman Problem in this formulation,
although there are good methods; see for example Kolman & Beck (1995, Chapter 4). A
theorem of Hoffman and Kruskal may be of interest in this context to show that some of
our earlier examples were legitimate applications: if the constraint matrix consists entirely
of 0's, 1's and −1's, then any solution obtained using the standard simplex method will
automatically be integral. This means that there was no problem above until the variables
$u_i$ were introduced, at which point there was no longer a guarantee that the simplex method
would find variables with integer values.

8.4.2 Genetic Algorithm Formulation 

With this conventional introduction, we consider how the Travelling Salesman Problem can
be tackled using a genetic algorithm. This example, and much of the remaining discussion
in this section, is taken from Michalewicz (1996, Chapter 10).

The two essential ingredients of a genetic algorithm are the availability of a natural 
representation of the problem and its constraints, which leads to suitable genetic operations, 
and the availability of a useful fitness function. In this problem, the fitness function is clear; 
two tours can be evaluated simply by looking at the total cost of each tour; in contrast 
there is no natural representation, which describes the problem. 

Of course a naive binary representation of each tour is available, in which each city is 
numbered, and a gene is simply an ordered list of city numbers. This representation does 
not however lend itself to useful genetic operations; a naive mating may not even produce a 
tour (some city numbers may be invalid), and is unlikely to be a valid tour with no repeat 
cities. 

We are thus led to seek alternative (non-binary) representations which are adapted to 
this particular problem. Here are some possible representations. 

• the ordinal representation, in which the tour (1, 2, 4, 3, 8, 5, 9, 6, 7) is represented as
the list (1, 1, 2, 1, 4, 1, 3, 1, 1) in which the code refers to the position of the next city
in the list of unused cities;




• the path representation, which was the natural one we used above; and

• the adjacency representation, in which the tour (1, 2, 4, 3, 8, 5, 9, 6, 7) is given in the
form (2, 4, 8, 3, 9, 7, 1, 5, 6).
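
The conversions between these representations can be sketched as below. The function names are illustrative; cities are numbered from 1, and the adjacency list is read as "entry $i$ is the city visited immediately after city $i$", which is the reading that reproduces the example above.

```python
def path_to_ordinal(tour):
    """Encode a path as positions in the shrinking list of unused cities."""
    remaining = sorted(tour)
    code = []
    for city in tour:
        idx = remaining.index(city)
        code.append(idx + 1)        # positions are counted from 1
        remaining.pop(idx)
    return code


def ordinal_to_path(code):
    """Decode an ordinal list back into a path."""
    remaining = list(range(1, len(code) + 1))
    return [remaining.pop(c - 1) for c in code]


def path_to_adjacency(tour):
    """Entry i (counting from 1) is the city that follows city i on the tour."""
    n = len(tour)
    adjacency = [0] * n
    for k, city in enumerate(tour):
        adjacency[city - 1] = tour[(k + 1) % n]
    return adjacency


tour = [1, 2, 4, 3, 8, 5, 9, 6, 7]
print(path_to_ordinal(tour))     # prints: [1, 1, 2, 1, 4, 1, 3, 1, 1]
print(path_to_adjacency(tour))   # prints: [2, 4, 8, 3, 9, 7, 1, 5, 6]
```

One feature of the ordinal form is that the entry in position $k$ only has to lie between 1 and $n - k + 1$, so cutting two ordinal lists at the same site and swapping the tails always decodes to a valid tour.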

Even when cross-over is defined, it does not lead to natural “genetic evolution” and so it is
necessary to have much more specialised crossover operators, in which the naive cross-over
is “repaired”. More details of suitable genetic operators can be found in Michalewicz (1996,
Chapter 10).

These ideas suggest that local deterministic improvements can also be used during a 
generation — perhaps the equivalent of the influence of the environment on the individual, 
so that the assessed fitness of the individual is not the same as its fitness at birth. In 
practice such memetic algorithms give better results than a purely evolutionary strategy.
One way to describe this is that a genetic algorithm can “get to the right hill” but then 
local hill climbing can improve on the solution. 


8.5 Example Problems 

We describe a number of “real” problems, both mathematical and non mathematical which 
genetic algorithms proved able to solve successfully. 

8.5.1 Numerical Optimisation 

Our first example is a “straight” numerical optimisation problem. For example, consider
the problem

8.7. Example. Minimise

$$f(x) = \sum_{j=1}^{10} x_j \left( c_j + \ln \frac{x_j}{x_1 + \cdots + x_{10}} \right)$$

subject to

$$x_1 + 2x_2 + 2x_3 + x_6 + x_{10} = 2,$$
$$x_4 + 2x_5 + x_6 + x_7 = 1,$$
$$x_3 + x_7 + x_8 + 2x_9 + x_{10} = 1,$$
$$x_1, \ldots, x_{10} \ge 0.000001,$$

where the constants $c_j$ are given by

$c_1 = -6.089$; $c_2 = -17.164$; $c_3 = -34.054$; $c_4 = -5.914$; $c_5 = -24.721$;

$c_6 = -14.986$; $c_7 = -24.100$; $c_8 = -10.708$; $c_9 = -26.662$; $c_{10} = -22.179$.

The minimum is not known, but a genetic algorithm did better than the best previously
known value, which had been obtained by a conventional rather than a genetic
algorithm. For details of a number of examples like this, see Michalewicz (1996, Chapter
7); this is their “test case number 2”.

8.5.2 Credit Card Scoring 

Here the aim was to improve on an existing credit card scoring mechanism. Using the 
existing constraints of their system, but evolving the weights to be placed on various answers 





gave a system which was better than the existing one, reducing the loss by 1/2%. In fact it
turned out that not all the existing constraints were being obeyed by the existing system;
using the actual constraints of the existing system with a genetic algorithm led to an
improvement of 1.5%, and a saving of “a high six figure sum annually.” Even this does not
reflect the potential saving, since the work was done with historical, and hence “censored” 
data; it was not possible to look at the subsequent credit history of someone the algorithm 
scored well but the existing system rejected, since there was no subsequent history. 


8.5.3 Designing FPGAs

An FPGA — a Field Programmable Gate Array — is a fairly amorphous sort of computer 
chip. It has much potential, but no character until that has been designed in. Specifically 
it has no internal logic of its own, until the connections of each particular gate in the chip 
have been set up, which can be done after manufacture; in other words, in the field. This 
contrasts with, for example, a Pentium, which is designed with a specific function in mind.

A field programmable gate array (FPGA) has a finite number of internal switches, 
each of which can be either “on” or “off”. It has an output, taken as a real number, 
which depends on a fixed input, and these switch settings. Programming the chip involves 
choosing suitable settings for these internal switches. A program can be tested by setting 
the switches and measuring the output from the fixed input; a good program is one in 
which the measured output is close to a known “ideal” output. 

Given the problem, in which we have to choose a binary state of each of a finite number 
of gates, it is clear how we can encode a program as a binary string. We are given a real 
number $x$, say, which measures the difference between the output from the program and
the desired output; we take the fitness of the corresponding string to be $1/|x|$, so strings with
a greater fitness represent better attempts at the required program.

In case the example seems fanciful, I saw it running in Sussex University Innovation
Centre in October 1997; it was also featured in recent EPSRC publicity. The aim was to
“learn” how to do certain types of speech recognition.


8.5.4 The Cocktail Party Effect 

This is another example which I found in a recent “New Scientist” (Ngo & Bhadkamkar 1998)
in which a genetic algorithm is being used to do noise cancelling so that “clean” signals are 
presented to a speech recogniser. The aim is to imitate the voice of one of two speakers, 
when only the mixed signal is available. Once this is done, the voice of one speaker is 
subtracted from the mixed signal thus giving the voice of the other speaker. Each can then 
be processed separately. 

The assumption that makes this all work is that the two speakers produce uncorrelated 
signals; the genetic algorithm then evolves a sound wave so that it is uncorrelated with
the difference between the original signal and itself.

Cocktail Parties? The ability of the human ear to pick out a single speaker in a room 
full of speakers is known as the “cocktail party effect!” 

You can find more information at http://web.interval.com/papers/1997-062




8.5.5 Mondrians 

This was an image processing exercise; the aim was to locate grey rectangles on a (slightly
different) grey background, in the presence of noise, in a square image of size 512 x 512. A
potential solution could be described by giving a small number of parameters representing
the grey-scale value (between 0 and 255) of each rectangle, the location of two opposite
corners, and its precedence (from front to back say, with higher precedence rectangles 
appearing nearer the front, and so less likely to be obscured). The “background” is then 
coded as a full-sized rectangle with the lowest precedence. The fitness function was then 
derived from how well such a model of the image fitted the (synthetically created) data. 

In practice a genetic algorithm, with the “obvious” coding except for the use of Gray, 
rather than binary codes, proved better than many more elaborate methods based on 
following edges and guessing intersections. 
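One plausible reason for preferring Gray codes is that neighbouring grey levels then differ in a single bit, so a one-bit mutation can make a small change to a parameter. The sketch below illustrates this with the standard reflected Gray code; the function names are my own and it is not taken from the work described above.

```python
def binary_to_gray(n: int) -> int:
    """Reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_to_binary(g: int) -> int:
    """Invert binary_to_gray by folding in successive right shifts."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def hamming(a: int, b: int) -> int:
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Grey levels 127 and 128 are neighbours, but every bit of the ordinary
# binary code changes between them; in Gray code only one bit changes.
plain_distance = hamming(127, 128)
gray_distance = hamming(binary_to_gray(127), binary_to_gray(128))
```

With the ordinary binary coding, moving from level 127 to level 128 would need eight simultaneous bit flips, which mutation is very unlikely to produce.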

8.5.6 Prisoner’s Dilemma

This is a classic example of a non-zero sum game and illustrates the conflicts which can arise 
between the principles of competition and co-operation. The payoff matrix from Rowman’s 
viewpoint is shown in Table 8.8. 



                          Columnman         Columnman
                          remains silent    defects

Rowman remains silent     3,3               0,5

Rowman defects            5,0               1,1


Table 8.8: Payoff matrix for the Prisoner’s dilemma; the return to Rowman is given first,
then the return to Columnman.

The story behind this, which gives the problem its name, is that Rowman and Columnman
are both prisoners, who were involved in the same enterprise before being put in jail, but 
are now unable to communicate with each other. Each has been invited to give evidence 
against the other; the payoff matrix gives the number of years by which their sentence 
will be reduced given varying outcomes. If one of the prisoners defects and gives evidence 
against the other, he is rewarded; but only if the other does not make the same choice. 
Thus the total return to both prisoners is greatest (at 6 years reduction in total sentence) 
if each remains silent, while the strategy that minimises possible loss whatever the other 
player does, results in the least total return to both prisoners of 2 years remission. 
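The two claims in this paragraph can be checked directly from Table 8.8. The following is a minimal sketch; the names `PAYOFF` and `maximin_choice` are illustrative only.

```python
SILENT, DEFECT = "silent", "defect"

# Years of remission from Table 8.8, as (Rowman, Columnman),
# indexed by (Rowman's choice, Columnman's choice).
PAYOFF = {
    (SILENT, SILENT): (3, 3),
    (SILENT, DEFECT): (0, 5),
    (DEFECT, SILENT): (5, 0),
    (DEFECT, DEFECT): (1, 1),
}


def maximin_choice():
    """Rowman's choice whose worst-case return is largest."""
    def worst_case(row_choice):
        return min(PAYOFF[(row_choice, col)][0] for col in (SILENT, DEFECT))
    return max((SILENT, DEFECT), key=worst_case)


# Total remission to both prisoners for each pair of choices.
totals = {choices: sum(returns) for choices, returns in PAYOFF.items()}
```

Remaining silent risks a return of 0, while defecting guarantees at least 1, so the cautious choice for each player is to defect, even though joint silence gives the largest total.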

The problem becomes one of finding the best strategy in a series of games against the 
same opponent. Theoretical analysis is difficult, and in work by Axelrod (Mitchell 1996,
Page 30) a number of programs were pitted against each other in tournaments, using many
different algorithms. Perhaps surprisingly, a very simple algorithm proved very effective, 
namely: 

first co-operate (ie remain silent) and then “tit for tat”. 

The effect is to co-operate as soon as the other program does so. It turns out that genetic 
algorithms were able to discover this strategy; and indeed even improve on it slightly, by 
initially defecting, and then “apologising”! 
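A small simulation shows how “tit for tat” behaves in a repeated game. This is only a sketch using the payoffs of Table 8.8; it is not Axelrod's tournament code, and the strategy and function names are my own.

```python
# "C" = co-operate (remain silent), "D" = defect; payoffs from Table 8.8.
PAYOFF = {("C", "C"): (3, 3), ("C", "D"): (0, 5),
          ("D", "C"): (5, 0), ("D", "D"): (1, 1)}


def tit_for_tat(own_history, other_history):
    """Co-operate first, then copy the opponent's previous move."""
    return "C" if not other_history else other_history[-1]


def always_defect(own_history, other_history):
    return "D"


def play(strategy_a, strategy_b, rounds=10):
    """Play a repeated game and return the two total scores."""
    hist_a, hist_b = [], []
    score_a = score_b = 0
    for _ in range(rounds):
        a = strategy_a(hist_a, hist_b)
        b = strategy_b(hist_b, hist_a)
        gain_a, gain_b = PAYOFF[(a, b)]
        score_a += gain_a
        score_b += gain_b
        hist_a.append(a)
        hist_b.append(b)
    return score_a, score_b
```

Against itself, tit for tat co-operates throughout; against a permanent defector it loses only the first round and then defends itself.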








8.6 Multiple Objective Functions 

One final point should be made, which is valid for any optimisation method. What happens 
when, as is usual in real life, there is more than one objective function to be optimised, 
or alternatively, if the constraints which are required cannot be met? Consider first the 
problem 


maximise $f_1(x)$ and $f_2(x)$ subject to $g_i(x) = 0$ for $1 \le i \le n$.

Of course in simple cases, maximising $f_1$ may also maximise $f_2$, but the usual situation
would involve a compromise. It is necessary to weigh together the two functions and decide 
how they should be balanced. One solution, indicating a more general approach, simply 
builds a single objective function from the given ones. Thus the problem might be rewritten 
as 


maximise $f_1^2(x) + f_2^2(x)$ subject to $g_i(x) = 0$ for $1 \le i \le n$;

in exchange for a more complicated objective function, now necessarily quadratic, there 
is only one thing to optimise. This quadratic objective function is often thought of as 
a “distance” from the true solution. It is still necessary to see that the problem scales 
correctly, so that, for example, a very large departure from optimising $f_1$ is appropriately
reflected in the behaviour of the distance or “penalty” function, and this may introduce 
further non-linearities. Note also that even if the original functions and constraints were 
all linear, we move away from this domain by introducing the penalty function. 

An alternative way to employ this idea is when the constraints are inconsistent. Consider
the problem 


maximise $f(x)$ subject to $g_i(x) = 0$ for $1 \le i \le n$,

and suppose it turns out that the constraints overdetermine the problem and there is
no solution. We can allow the constraints to be relaxed by minimising a single objective 
function of the form 

$T(x) = -f(x) + \lambda_1 g_1(x)^2 + \lambda_2 g_2(x)^2 + \cdots + \lambda_n g_n(x)^2$.

Again we necessarily pass to a quadratic function, and again we have to ensure that the
scaling parameters $\lambda_1, \ldots, \lambda_n$ are sensibly chosen. This function does not even reproduce
the original result if the constraints can always be satisfied, although such a failure probably
indicates an inappropriate choice of the $\lambda$’s. A familiar example of this approach is the least
squares approximation of a straight line through many points; one we met earlier is the
“big M” method of Section 4.3.1.
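To see the effect of the scaling parameters, here is a toy case of my own (not from the notes): maximise $f(x) = x$ subject to the inconsistent constraints $x - 1 = 0$ and $x - 3 = 0$. The penalised function is quadratic in $x$, so setting its derivative to zero gives the minimiser in closed form.

```python
def penalised_objective(x, lam1, lam2):
    """T(x) = -f(x) + lam1*g1(x)^2 + lam2*g2(x)^2 for the toy problem."""
    f = x            # function we would like to maximise
    g1 = x - 1.0     # first constraint, g1(x) = 0
    g2 = x - 3.0     # second constraint, inconsistent with the first
    return -f + lam1 * g1 ** 2 + lam2 * g2 ** 2


def minimiser(lam1, lam2):
    """Solve T'(x) = -1 + 2*lam1*(x - 1) + 2*lam2*(x - 3) = 0 for x."""
    return (1.0 + 2.0 * lam1 + 6.0 * lam2) / (2.0 * (lam1 + lam2))


x_equal = minimiser(1.0, 1.0)        # compromise, pulled upwards by f
x_heavy = minimiser(100.0, 100.0)    # large penalties swamp f
x_biased = minimiser(10.0, 1.0)      # first constraint weighted more
```

With equal weights the answer sits near the midpoint of the two constraints; increasing one weight drags the solution towards satisfying that constraint, which is exactly the scaling issue mentioned above.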


Chapter 9 


Simulated Annealing 


9.1 Introduction 

Optimisation problems occur frequently in many disciplines, often as a subsidiary task 
before the interesting problem. As an example assume we wish to investigate the bending 
of a metal sheet in order to touch (spot weld at?) a collection of non-coplanar points. The
first step is to fit the best plane to the given points, since this is the bit that can be done 
with no bending, and then look at the bending necessary from this equilibrium situation in 
order to achieve the required result. 

Standard calculus methods will work here. We specify the plane with three parameters, 
and then use the normal equations to calculate the best fit. Recall that we derived normal 
equations by looking for critical points of the fitting error, regarded as a function of the 
parameters defining the plane. The problem is almost always well behaved, and we quickly 
come up with a solution before getting on with the real work. 

More generally, iterative methods have traditionally been calculus-based. The conjugate
gradient method computes not just the value of the function at a given point, but
also the value of the derivative there. In several dimensions, this then gives information
about which way to move in order to improve (decrease if we are minimising) the value of
the function. One “obvious”, but not very good, way is the method of steepest descent
in which we minimise down the line of maximum gradient, move to that point and repeat.
A good overview of many such methods is in Press et al. (1992, Chapter 10); in particular
it describes the problems with the steepest descent methods (Press et al. 1992, Page 421)
in a long narrow valley.

All such traditional optimisation methods have a problem in the presence of functions 
which have many local minima in the area of concern. One such “difficult” example, even 
in one dimension, is shown in Fig. 9.1. It isn’t even artificial; the function is
$f(x) = (2 + \sin(\pi x)) \cos(20\pi x)$, and naturally crops up when studying amplitude modulation; $f$ is
“really” the rapidly varying function $\cos(20\pi x)$, but the amplitude has been modulated by
the slowly varying function $2 + \sin(\pi x)$. If this looks the sort of function you can easily
deal with, how about Fig. 9.2, where at coarse scale there appears to be a simple minimum,
while at the scale drawn, it is clear there is a small fluctuation which confuses the overall
picture. The function is $(1 + \cos(30\pi x)/15) \cos(\pi x)$; almost the same as $\cos(\pi x)$ but with
the additional “waviness” $\cos(30\pi x)/15$ which causes the trouble.
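To get a feel for how many traps the first function sets for a local method, one can simply count its local minima on a fine grid. This is a rough sketch; the interval $[0, 1]$ and the grid size are my own choices, not taken from the figure.

```python
import math


def f(x):
    """Amplitude-modulated function: (2 + sin(pi x)) cos(20 pi x)."""
    return (2.0 + math.sin(math.pi * x)) * math.cos(20.0 * math.pi * x)


def count_local_minima(func, a, b, steps):
    """Count interior grid points lying strictly below both neighbours."""
    ys = [func(a + (b - a) * i / steps) for i in range(steps + 1)]
    return sum(1 for i in range(1, steps)
               if ys[i] < ys[i - 1] and ys[i] < ys[i + 1])


n_minima = count_local_minima(f, 0.0, 1.0, 10000)
```

The carrier $\cos(20\pi x)$ completes ten oscillations on $[0, 1]$, and each contributes one local minimum, so a descent method started at random is likely to stop in one that is not the global minimum.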

This chapter complements the work on genetic algorithms in Chapter 8 and provides a
second example of what could be considered a modern optimisation method. Such methods
take advantage of the computing power now easily available on the desk, and go beyond
traditional techniques which require significant mathematical analysis to understand the
situation before starting a relatively circumscribed numerical process.

Figure 9.1: Locating a global maximum can be hard.

Figure 9.2: An apparently smooth minimum may be like this?

9.1.1 Overview 

The process of annealing is one in which a solid, usually metal, is first heated, and 
then allowed to cool slowly. As the solid cools, a change of state takes place in which 
individual atoms arrange themselves into a regular array corresponding to a minimum 
energy arrangement. Such an arrangement cannot easily propagate throughout the solid if 
the cooling occurs quickly, and boundaries between different “domains of regularity” occur. 
Such boundaries introduce potential “fault-lines” along which a fracture is most likely to 
occur when the material is stressed. To avoid such potential failures, metal is often cooled 
slowly, in a process known as annealing to permit re-arrangements at these boundaries so 
the same local minimum energy arrangement occurs throughout the material. 

This process is imitated in numerical optimisation. The idea originated with Metropolis,
Rosenbluth, Rosenbluth, Teller & Teller (1953) when trying to simulate such thermodynamic
systems. Given a potential state change from one with energy $E_1$ to one with energy
$E_2$, they chose to accept it with a probability

$\mathrm{Prob}(\mathrm{Accept}) = \min(1, \exp\{-(E_2 - E_1)/kT\})$

where T is the “temperature” and k is a constant — in this application it is Boltzmann’s 
constant. In words this always accepts a change if it moves to a state of lower energy, 
but sometimes accepts the change even though the system moves to a state with a higher 
energy. Note that for small T there is a very small probability of accepting an unfavourable 
move, while for large T, the probability of acceptance can be quite high. 
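The acceptance rule is short enough to write out directly. In this sketch the function names are my own, and the constant $k$ defaults to 1, as one would typically choose when the “energy” is just an objective function.

```python
import math
import random


def acceptance_probability(e1, e2, temperature, k=1.0):
    """min(1, exp(-(e2 - e1) / (k T))) for a move from energy e1 to e2."""
    if e2 <= e1:
        return 1.0  # moves to lower (or equal) energy are always accepted
    return math.exp(-(e2 - e1) / (k * temperature))


def accept(e1, e2, temperature, rng=random, k=1.0):
    """Decide at random whether to accept the proposed change."""
    return rng.random() < acceptance_probability(e1, e2, temperature, k)


# The same uphill move (energy 1 -> 2) at a low and at a high temperature.
p_cold = acceptance_probability(1.0, 2.0, 0.1)
p_hot = acceptance_probability(1.0, 2.0, 10.0)
```

Treating the downhill case separately also avoids evaluating the exponential of a large positive number.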

With this in mind we now describe the requirements in order to apply the same ideas 
to a more general minimisation problem. We need 

• a coding of the possible system states;

• an objective function which we are trying to minimise;

• a mechanism for proposing random changes to the state of the system; and

• a control parameter T, analogous to the temperature above, which governs the probability
of acceptance of the proposed change, together with an annealing schedule
specifying how the temperature is to be lowered.

9.2 An Example - the Travelling Salesman Problem 

We illustrate these ideas by applying them to the travelling salesman problem described in 
Section 8.4. This example is given in Press et al. (1992, Section 10.9). 

Coding 

There is no need for a carefully tailored coding scheme because we aren’t going to combine 
bits of it “at random” as we did with the genetic algorithm. Instead, we can simply use 
the obvious path representation described in Section 8.4.2 on page 96. Thus each potential 
tour can be described by a re-arrangement $\mathbf{i}$ of the city labels.

Re-arrangements 

We try to generate “intelligent” changes, and indeed this is at the heart of the method. 
Here are two possibilities: 

• a section of the tour is cut out and replaced by one running the other way round; or 

• the section is replaced by a random re-arrangement of that section. 
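Both kinds of move are easy to express when a tour is held as a list of city labels. A minimal sketch, with function names of my own choosing; positions `i` to `j - 1` form the section being changed.

```python
import random


def reverse_section(tour, i, j):
    """Copy of the tour with the section tour[i:j] run the other way round."""
    return tour[:i] + tour[i:j][::-1] + tour[j:]


def shuffle_section(tour, i, j, rng):
    """Copy of the tour with the section tour[i:j] randomly re-arranged."""
    section = tour[i:j]
    rng.shuffle(section)
    return tour[:i] + section + tour[j:]


tour = [0, 1, 2, 3, 4, 5]
reversed_tour = reverse_section(tour, 1, 4)
shuffled_tour = shuffle_section(tour, 1, 5, random.Random(0))
```

Each function returns a new list, so the current tour is still available if the proposed change is rejected.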

Objective Function 

In its simplest form, this is straightforward. In a “real” problem we know the distances 
between each pair of cities, so we simply minimise the objective function 

$f(\mathbf{i}) = \sum_i \sqrt{(x_i - x_{i-1})^2 + (y_i - y_{i-1})^2}$,

where $(x_i, y_i)$ is the position of the $i$th city. More generally the “distances” can be travelling
costs etc. It is possible to manipulate the answer eg by adding an additional cost to go 
between different “regions”; this for example could bias keeping the tour in Scotland until 
all Scottish cities have been done, before moving to England etc. 
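For cities given by coordinates, the objective function is a few lines of code. The sketch assumes the tour is closed, so that the last city is joined back to the first; the name `tour_length` is my own.

```python
import math


def tour_length(tour, coords):
    """Total Euclidean length of the closed tour through coords."""
    total = 0.0
    for k in range(len(tour)):
        x0, y0 = coords[tour[k - 1]]   # k - 1 = -1 wraps round to the last city
        x1, y1 = coords[tour[k]]
        total += math.hypot(x1 - x0, y1 - y0)
    return total


# Four cities at the corners of a unit square.
square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
round_the_edge = tour_length([0, 1, 2, 3], square)
across_diagonals = tour_length([0, 2, 1, 3], square)
```

A regional bias of the kind just described could be added by including an extra cost in the loop whenever the two cities lie in different regions.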

Annealing Schedule 

Here I am very dependent on the experience of others. But this is an outline of how you 
might start. The first need is to get the scale of the problem correct - try some typical 
solutions to find out how the objective function varies, and use this to choose the parameter 
T so that initially it will allow such changes with high probability. Then, at each
temperature, run the algorithm until either

• a total of 100N reconfigurations have occurred; or

• a total of 10N successful reconfigurations have occurred,

where N is the number of cities. Next, decrease the temperature by (say) 10% and repeat
the calculations. Continue in this way until there is so little change that the system seems
to have converged.
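Putting the pieces together gives a complete, if naive, annealing loop. This is a sketch under my own assumptions rather than the code of Press et al.: the starting temperature is simply the mean leg length of a random tour, only the section-reversal move is used, and `max_stages` is a safety cap.

```python
import math
import random


def tour_length(tour, coords):
    """Length of the closed tour through coords."""
    return sum(math.hypot(coords[tour[k]][0] - coords[tour[k - 1]][0],
                          coords[tour[k]][1] - coords[tour[k - 1]][1])
               for k in range(len(tour)))


def anneal_tour(coords, rng, cooling=0.9, max_stages=200):
    """Simulated annealing for a small TSP; assumes at least 3 cities."""
    n = len(coords)
    tour = list(range(n))
    rng.shuffle(tour)
    current = tour_length(tour, coords)
    best, best_len = tour[:], current
    temperature = current / n          # crude initial scale (an assumption)
    for _ in range(max_stages):
        tried = succeeded = 0
        # Stay at this temperature for 100N trials or 10N successes.
        while tried < 100 * n and succeeded < 10 * n:
            tried += 1
            i = rng.randrange(n - 1)
            j = rng.randrange(i + 2, n + 1)
            candidate = tour[:i] + tour[i:j][::-1] + tour[j:]
            cand_len = tour_length(candidate, coords)
            delta = cand_len - current
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                tour, current = candidate, cand_len
                succeeded += 1
                if current < best_len:
                    best, best_len = tour[:], current
        if succeeded == 0:
            break                      # nothing accepted: treat as converged
        temperature *= cooling         # lower the temperature by 10%
    return best, best_len


# Eight cities equally spaced on a circle; the shortest tour goes round it.
cities = [(math.cos(2 * math.pi * k / 8), math.sin(2 * math.pi * k / 8))
          for k in range(8)]
best, best_len = anneal_tour(cities, random.Random(1))
```

Keeping a copy of the best tour seen is a small addition to the basic method; it costs little and protects against ending on an unlucky uphill move.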


9.3 Minimising a Function 

Our running example is that of minimising an objective function $f(x)$; we conclude by
seeing how the necessary ingredients are obtained. Of course the objective function $f$ is
clear, while the control parameter T and the cooling schedule have to be set by experiment
as before. The state of the system is given by the variable $x$, so to run the algorithm we
simply have to describe how to move from a potential solution $x$ to a new one $x + \delta x$.
One such method is as follows: calculate the gradient at $x$, and move downhill a distance
which depends on the value of the gradient, so small moves are made if the function appears
nearly flat. In fact there are many things wrong with our basic “change of state” algorithm;
the problems are described by Press et al. (1992, Page 451). You should go there for more
information.

9.4 Comparison 

Finally we compare this method of solution with that using the genetic algorithm. Unlike
many other “intelligent” algorithms, simulated annealing is not greedy, in the sense that
it does not always accept the best local move. It also seems capable of making decisions in a
logical order, first getting the gross features of the problem (ie at high temperature) and 
only subsequently refining them. In contrast, the genetic algorithm seems much more like 
random search; in fact one is totally dependent on the coding being closely connected with 
the essence of the problem. However a genetic algorithm is almost trivial to program and 
so is capable of producing results quite quickly. 

Either can work. My own guess is that if you know enough about the problem to 
try simulated annealing, it is likely to provide a better solution, but I find the results of 
applying a genetic algorithm to the travelling salesman problem are impressive. 


9.5 Questions 7 (Hints and solutions start on page 127.) 

9.1. Q. Describe the various components of a simple genetic algorithm and explain how such an 
algorithm can be used to maximise a function. Illustrate your answer by considering the problem:

maximise $\{n^2 \mid n \text{ is an integer with } 0 \le n \le 31\}$.

9.2. Q. A field programmable gate array (FPGA) has a finite number of internal switches, each of 
which can be either “on” or “off”. It has an output, taken as a real number, which depends on a 
fixed input, and these switch settings. Programming the chip involves choosing suitable settings for 
these internal switches. A program can be tested by setting the switches and measuring the output 
from the fixed input; a good program is one in which the measured output is close to a known 
“ideal” output. 

Explain how a genetic algorithm could be used to evolve a good program for an FPGA, indicating 
briefly the processes involved. 




9.3. Q. Write a comparison of linear and non-linear optimisation methods. You should give a brief 
indication of typical methodologies in each case (but for just one non-linear method), and describe 
the type of problem for which each method is suitable. Comment on performance issues in general, 
and indicate the performance you would expect from each method, if both methods could be applied 
to a particular problem. 

9.4. Q. Describe briefly methods you might use for a non-linear optimisation problem, giving an
indication of typical methodologies both for problems which are “nearly” linear and more general 
ones. Comment on performance issues in general, and indicate the performance you would expect 
from each method, if both methods could be applied to a particular problem. 


9.5. Q. a) Describe briefly the Travelling Salesman Problem. Illustrate your description by explicitly
finding the shortest tour given the distances (or costs) between cities shown in Table 9.1.


                  To
From       A     B     C     D
A          -     4     6    12
B          3     -     6     8
C          7    10     -    10
D         11     7     9     -


Table 9.1: Distances (or costs) between cities.


b) Describe briefly how a solution to the Travelling Salesman Problem might be obtained using 
a genetic algorithm, indicating how features of the problem map to the elements needed to use a 
genetic algorithm. 

c) Describe briefly how a solution to the Travelling Salesman Problem might be obtained using 
simulated annealing, again indicating how you would obtain the necessary features for this problem. 






Appendix A 

Solutions to Exercises 


Solutions for Questions 1 (page 10).

Solution 1.1: 

a) Smaller rolls can be created using a number of methods. In method $M_1$, the roll is cut into
two equal 9 ft. rolls with no waste. In method $M_2$ the roll is cut into a 9 ft. and a 7 ft. roll with 2
ft. of waste. In method $M_3$ the roll is cut into a 9 ft. and a 5 ft. roll with 4 ft. of waste. In method
$M_4$ the roll is cut into two 7 ft. rolls with 4 ft. of waste. In method $M_5$ the roll is cut into a 7 ft.
and two 5 ft. rolls with 1 ft. of waste. Finally in method $M_6$ the roll is cut into three 5 ft. rolls with
3 ft. of waste.
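The six methods can be found mechanically by listing every combination of widths that fits on the roll and leaves less waste than the narrowest width. The sketch assumes the large roll is 18 ft wide, which is inferred from method $M_1$ (two 9 ft. rolls with no waste); the names are my own.

```python
from itertools import product

ROLL = 18            # assumed width of the large roll, in feet
WIDTHS = (9, 7, 5)   # widths of the smaller rolls required


def cutting_patterns(roll=ROLL, widths=WIDTHS):
    """Return (counts, waste) for every pattern that wastes less than
    the narrowest width, so that no further roll could be cut from it."""
    patterns = []
    limits = [range(roll // w + 1) for w in widths]
    for counts in product(*limits):
        waste = roll - sum(c * w for c, w in zip(counts, widths))
        if 0 <= waste < min(widths):
            patterns.append((counts, waste))
    return patterns


patterns = cutting_patterns()
```

Each entry gives the number of 9 ft., 7 ft. and 5 ft. rolls produced, followed by the waste; for example `((0, 1, 2), 1)` is method $M_5$.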

b) Let $x_i$ be the number of large rolls that are cut using method $M_i$. Then there are $2x_1 + x_2 + x_3$
rolls available that are 9 ft long, $x_2 + 2x_4 + x_5$ rolls available that are 7 ft long and $x_3 + 2x_5 + 3x_6$
rolls available that are 5 ft long. Thus the linear programming problem becomes:

minimise $x_1 + x_2 + x_3 + x_4 + x_5 + x_6$, subject to $x_i \ge 0$ ($1 \le i \le 6$) and

$2x_1 + x_2 + x_3 \ge 10$,

$x_2 + 2x_4 + x_5 \ge 20$,

$x_3 + 2x_5 + 3x_6 \ge 50$.

There are at least two problems that could occur in practice. One is the possibility that the $x_i$
may not be integers; clearly it would be necessary to use the next largest integer number of rolls.
The second is the difficulty in making the cut using method $M_1$, which involves no waste at all; in
practice the resulting rolls may be too short. It may be that all three constraints above are tight,
so we create exactly the right number of smaller rolls of each size; if not, some account should
be taken of that waste.

c) There are a total of $x_1 + 2(x_2 + x_3 + x_4) + 3(x_5 + x_6)$ cuts to be made, while the amount
of waste in feet is $2x_2 + 4(x_3 + x_4) + x_5 + 3x_6$. Thus the total net cost $N$ in pounds is

$N = (x_1 + x_2 + x_3 + x_4 + x_5 + x_6)P$

$\quad + (x_1 + 2x_2 + 2x_3 + 2x_4 + 3x_5 + 3x_6)C$

$\quad - (2x_2 + 4x_3 + 4x_4 + x_5 + 3x_6)k$,

and this gives the objective function to minimise subject to the same constraints as above.




Solution 1.2: 

a) Let the factory produce $b$ copies of the “Bashful” sculpture, $d$ copies of the “Dozy” sculpture
and $h$ copies of the “Happy” sculpture each week. The “reality” condition insists that $b \ge 0$, $d \ge 0$
and $h \ge 0$ (and, to be really formal, that each of $b$, $d$ and $h$ is integral). The total profit $P$ in £’s
is given by

$P = 2b + 4d + 3h$.

We have constraints based on the availability of the machines. From the given table, we will use
$2b + d$ hours on machine A, $b + 3h$ hours on machine B and $2b + 3d + 2h$ hours on machine C. Thus
our availability constraints are:

$2b + d \le 43$,

$b + 3h \le 37$,

$2b + 3d + 2h \le 42$.

All the constraints are linear; hence the formulation as a linear programming problem is simply to
maximise $P$ subject to these three constraints and the reality constraint.

Solution 1.3: Let $b$, $d$ and $f$ be the number of kilograms of binder, disintegrant and filler in each
100 kilograms of the formulation. Then since there will be 14 kilograms of active ingredient in each
100 kilograms of the formulation, $b + d + f = 86$. The binder-filler constraint is that $10b \ge f$, while
the constraint on the disintegrant gives $4d \le b + 14$. These, together with the reality requirement
that $b \ge 0$, $d \ge 0$ and $f \ge 0$, are all the constraints, and the problem is to minimise the total cost
$C = 50b + 15d + 2f$ subject to these constraints.

Solution 1.4: Let $a$, $b$ and $c$ be the number of cars of each type that are to be made. The reality
constraint, that $a \ge 0$, $b \ge 0$ and $c \ge 0$, is clearly essential. The labour availability in the two
factories gives:

$8a + 8b + 9c \le 10120$,

$8a + 9b + 11c \le 11000$.

An additional constraint might be the need for $a$, $b$ and $c$ to be integers, although since this is a
monthly figure, it would be natural to hold uncompleted cars until the following month. The total
profit made $P$, in pounds, is then $P = 1100a + 1200b + 1450c$.

The mathematical model is unrealistic in many respects. Some factors are: 

• it is very unlikely that the assumption of constant profit per vehicle is true; there are probably 
some fixed costs involved and also capacity problems. 

• there is no reflection of market demand; it is plausible that no cars of type B are made, a 
result which would be unacceptable in practice; and 

• there is an unlikely simplicity in the product range; I would expect there to be many more 
options with varying nett profits in a real situation. 

Solution 1.5: Let $r_1$, $r_2$ and $r_3$ be the number of kilos of cereal, dried fruit and nuts respectively
which are mixed to make the “Rich” blend, and define $h_1$, $h_2$ and $h_3$ and $c_1$, $c_2$ and $c_3$ to be the
corresponding weights for the “Healthy” and “Crunchy” mixes.

The total cost of the cereals is

$C = 1.5(c_2 + h_2 + r_2) + 1.0(c_3 + h_3 + r_3) + 0.8(r_1 + h_1 + c_1)$

while the total sales income is

$S = 2.0(r_1 + r_2 + r_3) + 1.6(c_1 + c_2 + c_3) + 1.2(h_1 + h_2 + h_3)$

and the difference $S - C$ between these two figures gives the profit which is to be maximised.
The constraint that the “Crunchy” blend must contain at least 60% nuts becomes

$c_3 \ge 0.6(c_1 + c_2 + c_3)$.

The other constraints are given in the same way: for the “Healthy” mix,

$h_1 \ge 0.6(h_1 + h_2 + h_3)$ and $h_3 \le 0.2(h_1 + h_2 + h_3)$

and for the “Rich” blend,

$r_1 \le 0.2(r_1 + r_2 + r_3)$ and $r_2 \ge 0.6(r_1 + r_2 + r_3)$.

We have three supply constraints

$c_1 + h_1 + r_1 \le 100$,

$c_2 + h_2 + r_2 \le 80$,

$c_3 + h_3 + r_3 \le 60$.

In addition of course we have the reality constraints that $h_i \ge 0$, $c_i \ge 0$ and $r_i \ge 0$.


Solutions for Questions 2 (page 27). 


Solution 2.1: We first introduce slack variables and convert to tableau form. 

> with(linalg): A:=matrix(3,2,[1,-1,2,1,-5,-2]):

> B:=concat(A,diag(1,1,1),vector([2,7,0]));

         [  1  -1   1   0   0   2 ]
    B := [  2   1   0   1   0   7 ]
         [ -5  -2   0   0   1   0 ]

Since this is a maximising problem, there is a choice of two proper signs for improvement, in 
columns 1 and 2. Choosing column 2 means there is no choice of ratios, since one of the two relevant 
entries is negative and so ignored. 

> C:=pivot(B,2,2);

         [  3   0   1   1   0    9 ]
    C := [  2   1   0   1   0    7 ]
         [ -1   0   0   2   1   14 ]

After pivoting there is a proper sign for improvement in column 1. Looking at the appropriate
ratios shows we should swap $a_3$ out from the basis.

> E:=mulrow(C,1,1/3):F:=pivot(E,1,1);

         [ 1   0    1/3   1/3   0    3 ]
    F := [ 0   1   -2/3   1/3   0    1 ]
         [ 0   0    1/3   7/3   1   17 ]

Making the other choice initially leads to exactly the same choices, done in the opposite order.

> C:=pivot(B,1,1);E:=mulrow(C,2,1/3):F:=pivot(E,2,2);

         [ 1  -1   1   0   0    2 ]
    C := [ 0   3  -2   1   0    3 ]
         [ 0  -7   5   0   1   10 ]

         [ 1   0    1/3   1/3   0    3 ]
    F := [ 0   1   -2/3   1/3   0    1 ]
         [ 0   0    1/3   7/3   1   17 ]

The maximum value is 17 which occurs when $x_1 = 3$ and $x_2 = 1$.
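For readers without Maple, the same steps can be reproduced with exact fractions. The two helper functions below are my own sketches which imitate the behaviour of `pivot` and `mulrow` as it appears in the listings above (rows and columns numbered from 1, and `pivot` leaving the pivot row itself unchanged); they are not part of any library.

```python
from fractions import Fraction


def pivot(tableau, r, c):
    """Clear column c in every row except row r, using row r."""
    rows = [row[:] for row in tableau]
    pivot_row = rows[r - 1]
    for i, row in enumerate(rows):
        if i != r - 1:
            factor = row[c - 1] / pivot_row[c - 1]
            rows[i] = [a - factor * p for a, p in zip(row, pivot_row)]
    return rows


def mulrow(tableau, r, k):
    """Multiply row r by the constant k."""
    rows = [row[:] for row in tableau]
    rows[r - 1] = [k * a for a in rows[r - 1]]
    return rows


# Initial tableau B of Solution 2.1, held as exact fractions.
B = [[Fraction(v) for v in row] for row in
     ([1, -1, 1, 0, 0, 2], [2, 1, 0, 1, 0, 7], [-5, -2, 0, 0, 1, 0])]

C = pivot(B, 2, 2)
F = pivot(mulrow(C, 1, Fraction(1, 3)), 1, 1)

# Pivoting in the other order reaches the same final tableau.
C2 = pivot(B, 1, 1)
F2 = pivot(mulrow(C2, 2, Fraction(1, 3)), 2, 2)
```

The last column of `F` then holds $x_1 = 3$, $x_2 = 1$ and the objective value 17, in agreement with the Maple output.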


Solution 2.2: We first introduce slack variables and convert to tableau form. 

> with(linalg): A:=matrix(3,2,[-2,1,1,3,3,-4]):

> B:=concat(A,diag(1,1,1),vector([5,22,0]));

         [ -2   1   1   0   0    5 ]
    B := [  1   3   0   1   0   22 ]
         [  3  -4   0   0   1    0 ]


Since this is a maximising problem, there is only one proper sign for improvement, in column 2.
Looking at the appropriate ratios shows we should swap $a_3$ out from the basis. Pivoting, we have:

> C:=pivot(B,1,2);

         [ -2   1   1   0   0    5 ]
    C := [  7   0  -3   1   0    7 ]
         [ -5   0   4   0   1   20 ]


Again there is only one proper sign for improvement. This time there is no choice of ratios, 
since one of the two relevant entries is negative and so ignored. 


> E:=mulrow(C,2,1/7);

         [ -2   1     1     0    0    5 ]
    E := [  1   0   -3/7   1/7   0    1 ]
         [ -5   0     4     0    1   20 ]

> F:=pivot(E,2,1);

         [ 0   1    1/7   2/7   0    7 ]
    F := [ 1   0   -3/7   1/7   0    1 ]
         [ 0   0   13/7   5/7   1   25 ]

There are now no proper signs for improvement, showing the algorithm has converged; the
maximum value is 25 and this is attained when $x_1 = 1$ and $x_2 = 7$.


Solution 2.3: We first introduce slack variables and convert to tableau form. 

> with(linalg): A:=matrix(3,2,[1,-2,2,1,-3,1]):

> B:=concat(A,diag(1,1,1),vector([5,15,0]));

         [  1  -2   1   0   0    5 ]
    B := [  2   1   0   1   0   15 ]
         [ -3   1   0   0   1    0 ]


Since this is a maximising problem, there is only one proper sign for improvement, in column 1.
Looking at the appropriate ratios shows we should swap $a_3$ out from the basis.

> C:=pivot(B,1,1);

         [ 1  -2   1   0   0    5 ]
    C := [ 0   5  -2   1   0    5 ]
         [ 0  -5   3   0   1   15 ]


Again there is only one proper sign for improvement. This time there is no choice of ratios, 
since one of the two relevant entries is negative and so ignored. 

> E:=mulrow(C,2,1/5);

         [ 1  -2     1     0    0    5 ]
    E := [ 0   1   -2/5   1/5   0    1 ]
         [ 0  -5     3     0    1   15 ]

> F:=pivot(E,2,2);

         [ 1   0    1/5   2/5   0    7 ]
    F := [ 0   1   -2/5   1/5   0    1 ]
         [ 0   0     1     1    1   20 ]

There are now no proper signs for improvement, showing the algorithm has converged; the
maximum value is 20 and this is attained when $x_1 = 7$ and $x_2 = 1$.


Solution 2.4: We first introduce slack variables and convert to tableau form. 

> A:=matrix(3,3,[3,-2,3,-2,1,6,1,3,-1]):

> B:=concat(A,diag(1,1,1),vector([3,5,0]));

         [  3  -2   3   1   0   0   3 ]
    B := [ -2   1   6   0   1   0   5 ]
         [  1   3  -1   0   0   1   0 ]


This is a minimising problem; we thus swap in either $a_1$ or $a_2$ since the corresponding entries
in the last row have the proper sign for improvement. With either choice, there is only one positive
entry in the column about which to pivot. Technically it is easier to pivot about the 1 in $a_2$. This
gives

> C:=pivot(B,2,2);

         [ -1   0   15   1   2   0    13 ]
    C := [ -2   1    6   0   1   0     5 ]
         [  7   0  -19   0  -3   1   -15 ]


Since now $a_1$ has the proper sign for improvement, but each entry in that column (apart from
the objective row) is negative, we deduce that there is no minimum.

Doing the first pivot about $a_1$ gives essentially the same result; in this case the new $a_2$ has
the proper sign for improvement but each entry in that column (apart from the objective row) is
negative.

> C:=pivot(B,1,1);

         [ 3    -2     3     1     0   0    3 ]
    C := [ 0   -1/3    8    2/3    1   0    7 ]
         [ 0   11/3   -2   -1/3    0   1   -1 ]


Solution 2.5: Let $B = \{v_1, v_2, \ldots, v_n\}$ be the given basis and let $B' = (B \setminus \{v_1\}) \cup \{u\}$. Then a
necessary and sufficient condition that $B'$ is a basis is that $\lambda_1 \ne 0$.

To see this, suppose first that $\lambda_1$ is nonzero. In order to show that $B'$ is a basis, it is enough to
show it is linearly independent, since it has the same number of elements as the basis $B$. So assume
that

$\mu u + \sum_{i=2}^{n} \mu_i v_i = 0$;

we show that each coefficient vanishes. From the given expression for $u$ in terms of $B$, we have

$\mu \lambda_1 v_1 + \sum_{i=2}^{n} (\mu \lambda_i + \mu_i) v_i = 0$,

and since this is a linear relation between members of the basis $B$, each coefficient must vanish. In
particular, since $\lambda_1 \ne 0$, we have $\mu = 0$, and so also $\mu_i = 0$ for all $i$. Thus the set $B'$ is linearly
independent.

Conversely, suppose that $B'$ is a basis. Then in particular it spans the space, so we can write

$v_1 = \mu u + \sum_{i=2}^{n} \mu_i v_i$,

and $\mu$ is non-zero, since otherwise we have

$v_1 = \sum_{i=2}^{n} \mu_i v_i$,

and this is a non-trivial linear relation between (linearly independent) elements of the basis $B$,
which cannot occur.

Since $\mu$ is non-zero, we can write

$u = \frac{1}{\mu} v_1 - \sum_{i=2}^{n} \frac{\mu_i}{\mu} v_i$,

and the coefficient $\lambda_1 = 1/\mu$ is non-zero as required.


Solutions for Questions 3 (page 53). 

Solution 4.1: 


a) In this part of the algorithm, it is necessary to pivot about the element $y_{q+1,s}$ in order to
maintain a feasible tableau, and increase the value of $u_{q+1}$ in an attempt to make it become positive.
Pivoting, we have

    Basis          $a_1$          $a_s$          e    b

    $a_{h_1}$      $y_{1,1}$      $y_{1,s}$      0    $u_1$ ($\ge 0$)
    $a_{h_q}$      $y_{q,1}$      $y_{q,s}$      0    $u_q$ ($\ge 0$)
    $a_{h_{q+1}}$  $y_{q+1,1}$    $y_{q+1,s}$    0    $u_{q+1}$ ($< 0$)
    e              $v_1$          $v_s$          1    $z$

    $a_{h_1}$      *              0              0    $u_1 - y_{1,s}(u_{q+1}/y_{q+1,s})$
    $a_{h_q}$      *              0              0    $u_q - y_{q,s}(u_{q+1}/y_{q+1,s})$
    $a_s$          *              1              0    $u_{q+1}/y_{q+1,s}$
    e              *              0              1    $z - v_s(u_{q+1}/y_{q+1,s})$

We have written * for entries which are irrelevant. We now show that each entry in the column
headed b is $\ge 0$, apart from the last entry which is the value of the objective function. This will
show that the new tableau corresponds to a feasible solution. Note first that since both $y_{q+1,s}$ and
$u_{q+1}$ are negative, the new value in this column (ie the value of $x_s$) is $u_{q+1}/y_{q+1,s} > 0$. Each of
the other basic variables is modified by the addition of $-y_{r,s}(u_{q+1}/y_{q+1,s})$, and since we are given
that $-y_{r,s} \ge 0$, it follows that each good row remains good.


b) We first introduce an artificial variable $x_5$ and slack variable $x_6$. Since we want an initial
tableau with a basic feasible solution, we rewrite the second inequality so it appears as a bad row.

> A:=matrix(3,4,[1,1,0,3,-1,-1,1,-2,-6,-7,-1,-15]):

> B:=concat(A,diag(1,1,1),vector([6,-5,0]));

         [  1   1   0    3   1   0   0    6 ]
    B := [ -1  -1   1   -2   0   1   0   -5 ]
         [ -6  -7  -1  -15   0   0   1    0 ]


However, out first aim is to remove the artificial variable from the basis. We can pivot about 
any non-zero entry in the first row; we choose to pivot about the first such. 


> C:=pivot(B,1,1) ; 


C := 


1 1 0 3 1 0 0 6 
0 0 11110 1 
0 -1 -1 3 6 0 1 36 


From now on, since $x_5$ is a non-basic variable, we ignore column 5. Since we are minimising,
there is a proper sign for improvement in column 4. Examining the relevant ratios shows we must
pivot about the entry in row 2, giving:

> E:=pivot(C,2,4);

           [1   1  -3  0  -2  -3  0   3]
      E := [0   0   1  1   1   1  0   1]
           [0  -1  -4  0   3  -3  1  33]

The resulting tableau has no proper sign for improvement, so the minimum value is 33, which
occurs when $x_1 = 3$ and $x_4 = 1$. Of course we can reduce the objective function using column 5,
but this will result in the artificial variable being non-zero; thus an infeasible solution.
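The two pivots above can be cross-checked outside Maple. The following is a minimal sketch in Python, using exact fractions; the helper `pivot` is our own name and is written on the assumption (suggested by the outputs printed above) that the Maple `pivot` command clears the chosen column but leaves the pivot row unscaled.

```python
from fractions import Fraction


def pivot(tableau, r, c):
    """Clear column c using row r (1-based); the pivot row is left unscaled."""
    rows = [[Fraction(x) for x in row] for row in tableau]
    pivot_row = rows[r - 1]
    if pivot_row[c - 1] == 0:
        raise ValueError("pivot entry is zero")
    result = []
    for i, row in enumerate(rows):
        if i == r - 1:
            result.append(pivot_row[:])
        else:
            factor = row[c - 1] / pivot_row[c - 1]
            result.append([a - factor * p for a, p in zip(row, pivot_row)])
    return result


# Initial tableau B of Solution 4.1 b); the last row is the objective row.
B = [[1, 1, 0, 3, 1, 0, 0, 6],
     [-1, -1, 1, -2, 0, 1, 0, -5],
     [-6, -7, -1, -15, 0, 0, 1, 0]]

C = pivot(B, 1, 1)   # remove the artificial variable from the basis
E = pivot(C, 2, 4)   # the pivot chosen by the ratio test

for row in E:
    print(" ".join(str(x) for x in row))
```

The last entry of the final row is the value 33 quoted in the text.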




















Solution 4.2: If the problem is unbounded, eventually a column will occur in which the row 
corresponding to the objective function will have the proper sign for improvement, but everything 
above that column will be an inappropriate pivot, in that each entry is negative or zero. Even if 
there are appropriate pivots in other columns, this, by itself, shows that the problem is unbounded 
in that suitable multiples of the solution corresponding to that column can be added to the existing 
solution to make it as large (in the required sense) as chosen. 

To solve the given problem we first introduce slack variables. In doing this we see that the 
second constraint generates a bad row; we move it down so it comes after all the good rows. 

> with(linalg): A:=matrix(4,3,[1,1,-2,1,3,-8,4,-1,1,6,-2,3]):
> B:=concat(A,diag(1,1,1,1),vector([2,8,-6,0]));

           [1   1  -2  1  0  0  0   2]
           [1   3  -8  0  1  0  0   8]
      B := [4  -1   1  0  0  1  0  -6]
           [6  -2   3  0  0  0  1   0]

Treating the bad row as the objective function, there is only one proper sign for improvement,
in column 2; the corresponding ratios are 2 and 8/3 so we swap $a_4$ out of the basis.

> C:=pivot(B,1,2);

           [ 1  1  -2   1  0  0  0   2]
           [-2  0  -2  -3  1  0  0   2]
      C := [ 5  0  -1   1  0  1  0  -4]
           [ 8  0  -1   2  0  0  1   4]


This still hasn’t made the bad row good. This time, there is a proper sign for improvement in 
column 3, and no choice of pivot row above it. In this situation we know we can make the bad row 
good by pivoting in the bad row itself, which we now do. 

> E:=mulrow(C,3,-1):E:=pivot(E,3,3);

           [ -9  1  0  -1  0  -2  0  10]
           [-12  0  0  -5  1  -2  0  10]
      E := [ -5  0  1  -1  0  -1  0   4]
           [  3  0  0   1  0  -1  1   8]


Now we see the situation we discussed at the start of the question. We are minimising the 
objective function, which thus has a proper sign for improvement in column 1. All the entries above 
it are negative, so we can make the objective function as large and negative as we wish. 
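The unboundedness test described at the start of this solution is mechanical, and can be sketched as below. This is an illustration only: the function name `unbounded_columns` is ours, and we assume the tableau layout used above (objective row last, then the column for e, then the column b).

```python
def unbounded_columns(tableau, minimise=True):
    """Return the 1-based columns that prove the problem unbounded.

    A column qualifies when the objective row has the proper sign for
    improvement there and no entry above it is a usable (positive) pivot.
    """
    *body, objective = tableau
    n_vars = len(objective) - 2          # skip the e column and the b column
    found = []
    for j in range(n_vars):
        improving = objective[j] > 0 if minimise else objective[j] < 0
        if improving and all(row[j] <= 0 for row in body):
            found.append(j + 1)
    return found


# Final tableau E of Solution 4.2.
E = [[-9, 1, 0, -1, 0, -2, 0, 10],
     [-12, 0, 0, -5, 1, -2, 0, 10],
     [-5, 0, 1, -1, 0, -1, 0, 4],
     [3, 0, 0, 1, 0, -1, 1, 8]]

print(unbounded_columns(E))
```

Column 1 is the one used in the text; the check also reports column 4, which has the same property in this tableau.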


Solution 4.3: 


a) We do the pivot suggested, and show the result below. 


    Basis         ... $a_s$ ...    $b$
    $a_{h_i}$     ... 0 ...        $u_i - y_{is}\,u_r/y_{rs}$
    $a_{h_r}$     ... 1 ...        $u_r/y_{rs}$

Note that since $u_r \geq 0$ because the initial tableau was feasible, we must have $y_{rs} > 0$ in order
that row $r$ be feasible afterwards. The other feasibility requirement is that $u_i - y_{is}\,u_r/y_{rs} \geq 0$. If
$y_{is} \leq 0$, this will always hold. However if $y_{is} > 0$, then we must have

    $\frac{u_r}{y_{rs}} \leq \frac{u_i}{y_{is}}$

and $r$ must be chosen to give the lowest such ratio among all rows $i$ for which $y_{is} > 0$.
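The rule just derived is the usual ratio test. A minimal sketch follows; the name `ratio_test` and its argument layout (the entries of the pivot column and of the column b, objective row excluded) are our own choices for illustration.

```python
from fractions import Fraction


def ratio_test(column, b):
    """Return (row, ratio) for the smallest u_i / y_is over rows with y_is > 0.

    Rows are numbered from 1; None is returned when no entry is positive.
    Ties are broken in favour of the first row found.
    """
    best = None
    for i, (y, u) in enumerate(zip(column, b), start=1):
        if y > 0:
            ratio = Fraction(u) / Fraction(y)
            if best is None or ratio < best[1]:
                best = (i, ratio)
    return best


# Column 2 of the first tableau in Solution 4.2: ratios 2 and 8/3, so row 1.
print(ratio_test([1, 3], [2, 8]))
```

A return value of `None` signals that the column offers no feasible pivot.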


b) To solve the given problem we first introduce slack variables in the first and last constraints,
and an artificial variable, $x_5$, in the second constraint to get the following set of constraints

    $x_1 + x_3 + x_4 = 4$,
    $x_1 - 2x_2 - 3x_3 + x_5 = 2$,
    $x_1 - 3x_2 - x_3 + x_6 = 1$.

Putting these into tableau form gives the following initial tableau.

> with(linalg):
> A:=matrix(4,3,[[1,0,1],[1,-2,-3],[1,-3,-1],[2,-1,4]]):
> B:=concat(A,diag(1,1,1,1),vector([4,2,1,0]));

           [1   0   1  1  0  0  0  4]
           [1  -2  -3  0  1  0  0  2]
      B := [1  -3  -1  0  0  1  0  1]
           [2  -1   4  0  0  0  1  0]

Since column 5 corresponds to an artificial variable we first swap $a_5$ from the basis, and then
leave it out, since this ensures that $x_5 = 0$. It is natural to swap in $a_1$ both because it is technically
easier, and because it keeps row 2 as a good row.

> C:=pivot(B,2,1);

           [0   2   4  1  -1  0  0   2]
           [1  -2  -3  0   1  0  0   2]
      C := [0  -1   2  0  -1  1  0  -1]
           [0   3  10  0  -2  0  1  -4]


Now row 3 is a bad row. We treat it as a subsidiary objective function and try to increase its 
value. Ignoring column 5 as we must, there is only one proper sign for improvement, in column 2 
and only one positive pivot, in row 1 . 


> E:=pivot(C,1,2):E:=mulrow(E,1,1/2);

           [0  1  2   1/2  -1/2  0  0   1]
           [1  0  1   1     0    0  0   4]
      E := [0  0  4   1/2  -3/2  1  0   0]
           [0  0  4  -3/2  -1/2  0  1  -7]


This is a minimising problem, and there is a proper sign for improvement in column 3. The 
lowest ratio (0) occurs in row 3. 


> F:=pivot(E,3,3):mulrow(F,3,1/4);

           [0  1  0  1/4   1/4  -1/2  0   1]
           [1  0  0  7/8   3/8  -1/4  0   4]
           [0  0  1  1/8  -3/8   1/4  0   0]
           [0  0  0  -2    1    -1    1  -7]

Doing the pivot does not improve the objective function, but now we can see the algorithm
has terminated since there are no remaining proper signs for improvement. (Again we ignore the
artificial variable in column 5.) The minimum value of the objective function is $-7$, which occurs
when $x_1 = 4$, $x_2 = 1$ and $x_3 = 0$.

Solution 4.4: To solve the given problem we first introduce slack variables in the first and last
constraints, and an artificial variable, $x_5$, in the second constraint to get the following set of
constraints

    $x_1 + 3x_3 + x_4 = 2$,
    $2x_1 - x_2 - x_3 + x_5 = 3$,
    $-4x_1 + x_2 + 2x_3 + x_6 = -7$.

Putting these into tableau form with the objective function $z - 6x_1 + x_2 - 12x_3 = 0$ gives the
following initial tableau.

> with(linalg):
> A:=matrix(4,8,[[1,0,3,1,0,0,0,2],[2,-1,-1,0,1,0,0,3],
> [-4,1,2,0,0,1,0,-7],[-6,1,-12,0,0,0,1,0]]);

           [ 1   0    3  1  0  0  0   2]
           [ 2  -1   -1  0  1  0  0   3]
      A := [-4   1    2  0  0  1  0  -7]
           [-6   1  -12  0  0  0  1   0]

Since column 5 corresponds to an artificial variable we first swap $a_5$ from the basis, and then
leave it out, since this ensures that $x_5 = 0$. It is natural to swap in $a_1$ because it keeps row 2 as a
good row.

> B:=mulrow(A,2,1/2):B:=pivot(B,2,1);

           [0   1/2   7/2  1  -1/2  0  0  1/2]
           [1  -1/2  -1/2  0   1/2  0  0  3/2]
      B := [0  -1     0    0   2    1  0  -1 ]
           [0  -2   -15    0   3    0  1   9 ]


Now row 3 is a bad row. We treat it as a subsidiary objective function and try to increase its 
value. Ignoring column 5 as we must, there is only one proper sign for improvement, in column 2 
and only one positive pivot, in row 1. 


> C:=mulrow(B,1,2):C:=pivot(C,1,2);

           [0  1   7  2  -1  0  0   1]
           [1  0   3  1   0  0  0   2]
      C := [0  0   7  2   1  1  0   0]
           [0  0  -1  4   1  0  1  11]


This is a minimising problem, and there is a proper sign for improvement in column 4. The
lowest ratio (0) occurs in row 3.


> E:=pivot(C,3,4);

           [0  1    0    0  -2    -1    0   1]
           [1  0  -1/2   0  -1/2  -1/2  0   2]
      E := [0  0    7    2   1     1    0   0]
           [0  0  -15    0  -1    -2    1  11]


Doing the pivot does not improve the objective function, but now we can see the algorithm
has terminated since there are no remaining proper signs for improvement. (Again we ignore the
artificial variable in column 5.) The minimum value of the objective function is 11, which occurs
when $x_1 = 2$, $x_2 = 1$ and $x_3 = 0$.

There seems no good reason to eliminate the artificial variable by pivoting at a negative entry. 
For information here is the result doing it one way: 

> F:=mulrow(A,2,-1):F:=pivot(F,2,2);

           [ 1  0    3  1   0  0  0   2]
           [-2  1    1  0  -1  0  0  -3]
      F := [-2  0    1  0   1  1  0  -4]
           [-4  0  -13  0   1  0  1   3]

> G:=pivot(F,1,1);

           [1  0   3  1   0  0  0   2]
           [0  1   7  2  -1  0  0   1]
      G := [0  0   7  2   1  1  0   0]
           [0  0  -1  4   1  0  1  11]

> H:=pivot(G,3,4);

           [1  0  -1/2   0  -1/2  -1/2  0   2]
           [0  1    0    0  -2    -1    0   1]
      H := [0  0    7    2   1     1    0   0]
           [0  0  -15    0  -1    -2    1  11]


Making a third choice gives one more step - but the initial pivot really doesn’t look promising. 

> J:=mulrow(A,2,-1):J:=pivot(J,2,3);

           [  7  -3  0  1    3  0  0   11]
           [ -2   1  1  0   -1  0  0   -3]
      J := [  0  -1  0  0    2  1  0   -1]
           [-30  13  0  0  -12  0  1  -36]

> K:=mulrow(J,1,1/7):K:=pivot(K,1,1);

           [1  -3/7  0   1/7   3/7  0  0  11/7]
           [0   1/7  1   2/7  -1/7  0  0   1/7]
      K := [0  -1    0   0     2    1  0  -1   ]
           [0   1/7  0  30/7   6/7  0  1  78/7]

> L:=mulrow(K,2,7):L:=pivot(L,2,2);

           [1  0   3  1   0  0  0   2]
           [0  1   7  2  -1  0  0   1]
      L := [0  0   7  2   1  1  0   0]
           [0  0  -1  4   1  0  1  11]

> M:=mulrow(L,3,1/2):M:=pivot(M,3,4);

           [1  0  -1/2   0  -1/2  -1/2  0   2]
           [0  1    0    0  -2    -1    0   1]
      M := [0  0   7/2   1   1/2   1/2  0   0]
           [0  0  -15    0  -1    -2    1  11]

















Solutions for Questions 4 (page 64). 

Solution 5.1: We first rewrite the primal in “standard form” as: maximise $-6x_1 - 7x_2 - x_3 - 15x_4$,
subject to $x_1, x_2, x_3, x_4 \geq 0$ and

    $-x_1 - x_2 + x_3 - 2x_4 \leq -5$,
    $x_1 + x_2 + 3x_4 \leq 6$,
    $-x_1 - x_2 - 3x_4 \leq -6$.

Thus if we let $b = (-5, 6, -6)^T$, $c = (-6, -7, -1, -15)^T$ and

          [-1  -1  1  -2]
      A = [ 1   1  0   3]
          [-1  -1  0  -3]

the primal is: maximise $c^T x$ subject to $Ax \leq b$ and $x \geq 0$. The dual then becomes: minimise $b^T w$
subject to $A^T w \geq c$ and $w \geq 0$. In terms of the definitions just given, the dual is thus:
minimise $-5w_1 + 6w_2 - 6w_3$, subject to $w_1, w_2, w_3 \geq 0$ and

    $-w_1 + w_2 - w_3 \geq -6$,
    $-w_1 + w_2 - w_3 \geq -7$,
    $w_1 \geq -1$,
    $-2w_1 + 3w_2 - 3w_3 \geq -15$.

Note also that by writing $w_4 = w_2 - w_3$, we can replace all references to the non-negative variables
$w_2$ and $w_3$, both in the objective function and the constraints, by the unrestricted variable $w_4$.

We are given in Question 4.1 that the primal has the optimal value 33. Note that this was for 
the given objective function; we changed the sign of this above, so for this primal, the optimal value 
was —33. It follows from the fundamental theorem of duality that the dual has an optimal solution 
with the same optimal value, namely —33. 
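Forming the dual is just a matter of transposing the data. The sketch below makes this explicit for the primal of this solution; the function name `dual_of` and the tuple it returns are illustrative choices of ours.

```python
def dual_of(A, b, c):
    """Dual of: maximise c.x subject to Ax <= b, x >= 0.

    Returns (At, rhs, objective) describing:
    minimise objective.w subject to At w >= rhs, w >= 0.
    """
    At = [list(column) for column in zip(*A)]
    return At, list(c), list(b)


A = [[-1, -1, 1, -2],
     [1, 1, 0, 3],
     [-1, -1, 0, -3]]
b = [-5, 6, -6]
c = [-6, -7, -1, -15]

At, rhs, objective = dual_of(A, b, c)
print("minimise", objective)
for row, bound in zip(At, rhs):
    print(row, ">=", bound)
```

Each printed row gives the coefficients of $w_1, w_2, w_3$ in one dual constraint, in the order listed above.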

Solution 5.2: 


a) The dual problem is to minimise $b^T w$ subject to $A^T w \geq c$, $w \geq 0$. Now suppose that $x$ and
$w$ are feasible solutions of the above primal and its dual. Note first that if $u \leq v$ and $x \geq 0$ then
$x^T u \leq x^T v$ since we first multiply an inequality by $x_j \geq 0$ and then add. Then, since $c \leq A^T w$,
and $x \geq 0$, we have

    $x^T c \leq x^T A^T w = (Ax)^T w \leq b^T w$

where in the last inequality we have used the note above and the fact that $Ax \leq b$. Since $x^T c = c^T x$,
the result follows.

If $c^T x = b^T w$ then $x$ and $w$ are optimal solutions of the primal and dual. To see this, recall
that for the primal problem, we are trying to maximise

    $\sum_{j=1}^{n} c_j x_j \leq \sum_{i=1}^{m} b_i w_i$,

where the inequality is the result we have just proved. If the above inequality is actually an equality,
then no change to any $x_j$ can increase the value of the objective function $c^T x$. And in the
same way, no change to any $w_i$ can decrease the value of the objective function $b^T w$ of
the dual problem. So both must already correspond to optimal solutions; indeed we see that both
the primal and the dual problem have optimal solutions with the same optimal value.


b) The original problem was to maximise $z = 10x_1 + 20x_2 - 10x_3$ subject to constraints that
$x_1 \geq 0$, $x_2 \geq 0$, $x_3 \geq 0$ and

    $2x_1 + 3x_2 - x_3 \leq 4$
    $-x_1 - 2x_2 \leq 3$
    $-x_1 + x_2 - 2x_3 \leq 1$.

The dual problem then becomes one of minimising $4w_1 + 3w_2 + w_3$ subject to constraints that
$w_1 \geq 0$, $w_2 \geq 0$, $w_3 \geq 0$ and

    $2w_1 - w_2 - w_3 \geq 10$
    $3w_1 - 2w_2 + w_3 \geq 20$
    $-w_1 - 2w_3 \geq -10$.


For completeness, the full tableau for the primal is given. 


    Basis    $a_1$   $a_2$   $a_3$   $a_4$   $a_5$   $a_6$   $e$    $b$
    $a_4$      2       3      -1       1       0       0      0      4
    $a_5$     -1      -2       0       0       1       0      0      3
    $a_6$     -1       1*     -2       0       0       1      0      1
    $e$      -10     -20      10       0       0       0      1      0

    $a_4$      5       0       5*      1       0      -3      0      1
    $a_5$     -3       0      -4       0       1       2      0      5
    $a_2$     -1       1      -2       0       0       1      0      1
    $e$      -30       0     -30       0       0      20      1     20

    $a_3$      1       0       1      1/5      0     -3/5     0     1/5
    $a_5$      1       0       0      4/5      1     -2/5     0    29/5
    $a_2$      1       1       0      2/5      0     -1/5     0     7/5
    $e$        0       0       0       6       0       2      1     26

This gives an optimum value of 26 for the primal problem with solution $x_1 = 0$, $x_2 = 7/5$ and
$x_3 = 1/5$. The corresponding dual problem has solution $w_1 = 6$, $w_2 = 0$ and $w_3 = 2$. Of course the
optimal value remains at 26.

There was a choice of pivot in the second tableau; an equally acceptable choice is to pivot at
row 1, column 1; doing so gives:

    Basis    $a_1$   $a_2$   $a_3$   $a_4$   $a_5$   $a_6$   $e$    $b$
    $a_1$      1       0       1      1/5      0     -3/5     0     1/5
    $a_5$      0       0      -1      3/5      1      1/5     0    28/5
    $a_2$      0       1      -1      1/5      0      2/5     0     6/5
    $e$        0       0       0       6       0       2      1     26

which gives the same solution as the above for the dual, and of course the same optimum value of
26. The solution to the primal problem differs from the one obtained above, having $x_1 = 1/5$,
$x_2 = 6/5$ and $x_3 = 0$; however the optimal value of course remains the same.
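The result of part a) gives a cheap certificate of optimality: a feasible primal point and a feasible dual point with equal objective values must both be optimal. The sketch below checks this for the two primal solutions and the dual solution read off from the tableaux; the helper names are ours.

```python
from fractions import Fraction as F

A = [[2, 3, -1],
     [-1, -2, 0],
     [-1, 1, -2]]
b = [4, 3, 1]
c = [10, 20, -10]


def dot(u, v):
    return sum(p * q for p, q in zip(u, v))


def primal_feasible(x):
    """x >= 0 and Ax <= b."""
    return all(xi >= 0 for xi in x) and all(dot(row, x) <= bi for row, bi in zip(A, b))


def dual_feasible(w):
    """w >= 0 and A^T w >= c."""
    return all(wi >= 0 for wi in w) and all(dot(col, w) >= cj for col, cj in zip(zip(*A), c))


x_first = [F(0), F(7, 5), F(1, 5)]    # from the first final tableau
x_second = [F(1, 5), F(6, 5), F(0)]   # from the alternative pivot
w = [6, 0, 2]                         # objective-row entries under the slack columns

for x in (x_first, x_second):
    print(primal_feasible(x), dot(c, x))
print(dual_feasible(w), dot(b, w))
```

All three points are feasible and share the objective value 26, which confirms that both primal solutions are optimal.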

Solution 5.3: Given a primal: maximise $c^T x$ subject to $Ax \leq b$ and $x \geq 0$, the dual problem
is: minimise $b^T w$ subject to $A^T w \geq c$ and $w \geq 0$. Rewriting the primal we have: maximise
$-8x_1 - 5x_2 - 4x_3$ subject to the constraints $x_1 \geq 0$, $x_2 \geq 0$, $x_3 \geq 0$, and

    $-x_1 - 5x_2 + x_3 \leq -1$,
    $-x_1 + 2x_2 - x_3 \leq 4$.

The dual problem is thus: minimise $-w_1 + 4w_2$ subject to the constraints $w_1 \geq 0$, $w_2 \geq 0$, and

    $-w_1 - w_2 \geq -8$,
    $-5w_1 + 2w_2 \geq -5$,
    $w_1 - w_2 \geq -4$.

Of course both the requirement (to minimise), and the constraints, can be changed by multiplying
by an appropriate factor.

If both the primal and the dual problems have feasible solutions, then both have optimal
solutions with the same optimal value.


Solutions for Questions 5 (page 73). 

Solution 6.1: The payoff matrix to Rowman is

              0     1     2
        0     0     1     2
        1     1    -2     3
        2     2     3    -4

where we write Columnman’s choice along the top row, and Rowman’s choice down the first column.

The minima along each row are 0, $-2$ and $-4$, with maximum 0, while the maxima in each
column are 2, 3 and 3, with minimum 2. Since $2 > 0$, the game is not strictly determined.

We first increase the value of the game by 4 to ensure that the corresponding matrix has entries
all of which are non-negative. The related linear programming problem that Columnman has to
solve is then to maximise $x_1 + x_2 + x_3$ subject to:

    $4x_1 + 5x_2 + 6x_3 \leq 1$,
    $5x_1 + 2x_2 + 7x_3 \leq 1$,
    $6x_1 + 7x_2 \leq 1$.

Putting this into the related standard form gives the initial tableau:

    Basis    $a_1$   $a_2$   $a_3$   $a_4$   $a_5$   $a_6$   $e$   $b$
    $a_4$      4       5       6       1       0       0      0     1
    $a_5$      5       2       7       0       1       0      0     1
    $a_6$      6       7       0       0       0       1      0     1
    $e$       -1      -1      -1       0       0       0      1     0

Although not asked for, the derivation of the final tableau is as follows:

> with(linalg):
> A:=matrix([[4,5,6],[5,2,7],[6,7,0],[-1,-1,-1]]):
> b:=vector([1,1,1,0]):
> A1:=concat(A,diag(1,1,1,1),b);

            [ 4   5   6  1  0  0  0  1]
            [ 5   2   7  0  1  0  0  1]
      A1 := [ 6   7   0  0  0  1  0  1]
            [-1  -1  -1  0  0  0  1  0]

> A2:=pivot(A1,3,1):A2:=mulrow(A2,3,1/6);

            [0    1/3    6  1  0  -2/3  0  1/3]
            [0  -23/6    7  0  1  -5/6  0  1/6]
      A2 := [1    7/6    0  0  0   1/6  0  1/6]
            [0    1/6   -1  0  0   1/6  1  1/6]

> A3:=pivot(A2,2,3):A3:=mulrow(A3,2,1/7);

            [0   76/21   0  1  -6/7   1/21   0  4/21]
            [0  -23/42   1  0   1/7  -5/42   0  1/42]
      A3 := [1    7/6    0  0   0     1/6    0  1/6 ]
            [0   -8/21   0  0   1/7   1/21   1  4/21]

> A4:=pivot(A3,1,2):A4:=mulrow(A4,1,21/76);

            [0  1  0   21/76   -9/38    1/76    0  1/19]
            [0  0  1   23/152   1/76  -17/152   0  1/19]
      A4 := [1  0  0  -49/152  21/76   23/152   0  2/19]
            [0  0  0    2/19    1/19    1/19    1  4/19]

Thus the optimal value of the linear programming problem is $z = 4/19$, and the optimal
strategies for Rowman and Columnman are both

    $\frac{19}{4}\left(\frac{2}{19}, \frac{1}{19}, \frac{1}{19}\right) = \left(\frac{1}{2}, \frac{1}{4}, \frac{1}{4}\right)$.

The value of the original game is $1/z - 4 = 3/4$.
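A quick way to confirm a claimed solution of a matrix game is to check that, against the stated mixed strategy, every pure strategy of the opponent yields the same payoff. The sketch below does this for the original (unshifted) payoff matrix; the variable names are ours.

```python
from fractions import Fraction as F

# Payoff matrix to Rowman, before the shift by 4.
M = [[0, 1, 2],
     [1, -2, 3],
     [2, 3, -4]]
p = [F(1, 2), F(1, 4), F(1, 4)]   # claimed optimal strategy for both players

# Payoff to Rowman for each pure row when Columnman plays p.
row_payoffs = [sum(M[i][j] * p[j] for j in range(3)) for i in range(3)]
# Payoff to Rowman for each pure column when Rowman plays p.
col_payoffs = [sum(p[i] * M[i][j] for i in range(3)) for j in range(3)]

print(row_payoffs)
print(col_payoffs)
```

Every entry equals 3/4, so neither player can gain by deviating, in agreement with the value found above.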

Solution 6.2: We label the 6 possible choices for H in an obvious way with the symbols R1C1,
R1C2, R1C3, R2C1, R2C2 and R2C3. Similarly we label the five choices of S as R1, R2, C1, C2
and C3. We then describe all possible outcomes of the game in the following matrix, in which each
row is labelled with a different choice by S and each column with a different choice by H.

            R1C1   R1C2   R1C3   R2C1   R2C2   R2C3   Min
    R1        1      2      3     -1     -1     -1     -1
    R2       -1     -1     -1      2      4      3     -1
    C1        1     -1     -1      2     -1     -1     -1
    C2       -1      2     -1     -1      4     -1     -1
    C3       -1     -1      3     -1     -1      3     -1
    Max       1      2      3      2      4      3

This matrix then becomes the payoff matrix for S. Around the matrix, the minimum in each row,
and maximum in each column have been computed. We have

    $\alpha = \max_i \min_j a_{ij}$,    $\beta = \min_j \max_i a_{ij}$.

Thus $\alpha = -1$ and $\beta = 1$ and the game is not strictly determined.
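The lower and upper values are easy to compute mechanically. A minimal sketch, with the function name `lower_upper` chosen by us, applied to the matrix of this solution and to the matrix of Solution 6.1:

```python
def lower_upper(matrix):
    """Return (alpha, beta): max of the row minima and min of the column maxima."""
    alpha = max(min(row) for row in matrix)
    beta = min(max(column) for column in zip(*matrix))
    return alpha, beta


# Payoff matrix for S from Solution 6.2 (rows R1, R2, C1, C2, C3).
S = [[1, 2, 3, -1, -1, -1],
     [-1, -1, -1, 2, 4, 3],
     [1, -1, -1, 2, -1, -1],
     [-1, 2, -1, -1, 4, -1],
     [-1, -1, 3, -1, -1, 3]]

# Payoff matrix to Rowman from Solution 6.1.
G = [[0, 1, 2],
     [1, -2, 3],
     [2, 3, -4]]

print(lower_upper(S))
print(lower_upper(G))
```

A game is strictly determined exactly when the two numbers returned agree.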


















Solution 6.3: Let $A = [a_{ij}]$ and recall that if $\alpha = \max_i \min_j a_{ij}$ and $\beta = \min_j \max_i a_{ij}$, then $\alpha$
and $\beta$ are respectively the lower and upper values for the matrix game $A$.

In our case, $A$ has column maxima 8 and 6 [the entries of $A$ are not recoverable here], while

                      Min
           [2  6]      2
      B =  [8  4]      4
           [4  2]      2
      Max   8  6

and so for $A$, we have $\alpha = \beta = 6$ and the game is strictly determined, while for $B$ we have $\alpha = 4$,
while $\beta = 6$.

We solve the game $B$ using the simplex method, so we seek a $y$ which is an optimum strategy
for Columnman; ie we maximise $y_1 + y_2$ subject to $By \leq 1$ and $y_1 \geq 0$, $y_2 \geq 0$.

The initial tableau is

> B:=matrix(4,2,[2,6,8,4,4,2,-1,-1]):
> B:=concat(B,diag(1,1,1,1),vector([1,1,1,0]));

           [ 2   6  1  0  0  0  1]
           [ 8   4  0  1  0  0  1]
      B := [ 4   2  0  0  1  0  1]
           [-1  -1  0  0  0  1  0]
Following the simplex algorithm, we choose to pivot about the element 8 in column 1, since each 
of columns 1 and 2 have the proper sign for improvement. Pivoting gives the tableau 


> C:=pivot(B,2,1):C:=mulrow(C,2,1/8);

           [0   5    1  -1/4  0  0  3/4]
           [1   1/2  0   1/8  0  0  1/8]
      C := [0   0    0  -1/2  1  0  1/2]
           [0  -1/2  0   1/8  0  1  1/8]


Continuing, we see that column 2 has the proper sign for improvement, and we must pivot about 
5, giving the tableau 


> E:=pivot(C,1,2):E:=mulrow(E,1,1/5);

           [0  1   1/5   -1/20  0  0  3/20]
           [1  0  -1/10   3/20  0  0  1/20]
      E := [0  0   0     -1/2   1  0  1/2 ]
           [0  0   1/10   1/10  0  1  1/5 ]

The optimum strategy is thus $5\left(\frac{1}{20}, \frac{3}{20}\right) = \left(\frac{1}{4}, \frac{3}{4}\right)$ for Columnman. The optimal strategy
for Rowman is $5\left(\frac{1}{10}, \frac{1}{10}, 0\right) = \left(\frac{1}{2}, \frac{1}{2}, 0\right)$.

The expected payoff if both players use this strategy is thus

    $\left(\frac{1}{2}, \frac{1}{2}, 0\right) B \left(\frac{1}{4}, \frac{3}{4}\right)^T = 5$.


The first game thus has a larger expected payoff to Rowman, and so is the logical choice. 

Solution 6.4: Let $A = [a_{ij}]$ and recall that if $\alpha = \max_i \min_j a_{ij}$ and $\beta = \min_j \max_i a_{ij}$, then $\alpha$
and $\beta$ are respectively the lower and upper values for the matrix game $A$.

In our case,

                      Min
           [3  2]      2
      A =  [2  6]      2
           [1  5]      1
      Max   3  6

and so $\alpha = 2$, while $\beta = 3$, and the game is not strictly determined. It is clear that row 3 will not
figure in Rowman’s solution, since whether Columnman chooses column 1 or column 2, Rowman is
better off choosing row 2.

We solve the game $A$ using the simplex method, so we seek a $y$ which is an optimum strategy
for Columnman; ie we maximise $y_1 + y_2$ subject to $Ay \leq 1$ and $y_1 \geq 0$, $y_2 \geq 0$.

The initial tableau is 


    Basis    $a_1$    $a_2$    $a_3$    $a_4$    $a_5$    $e$    $b$
    $a_3$      3        2        1        0        0       0      1
    $a_4$      2        6*       0        1        0       0      1
    $a_5$      1        5        0        0        1       0      1
    $e$       -1       -1        0        0        0       1      0

Following the simplex algorithm, we choose to pivot about the element indicated 6*, since each
of columns 1 and 2 have the proper sign for improvement. Pivoting gives the tableau

    Basis    $a_1$    $a_2$    $a_3$    $a_4$    $a_5$    $e$    $b$
    $a_3$     7/3*      0        1      -1/3       0       0     2/3
    $a_2$     1/3       1        0       1/6       0       0     1/6
    $a_5$    -2/3       0        0      -5/6       1       0     1/6
    $e$      -2/3       0        0       1/6       0       1     1/6

Continuing, we see that column 1 has the proper sign for improvement, and we must pivot about
7/3*, giving the tableau

    Basis    $a_1$    $a_2$    $a_3$    $a_4$     $a_5$    $e$    $b$
    $a_1$      1        0       3/7     -1/7        0       0     2/7
    $a_2$      0        1      -1/7      3/14       0       0     1/14
    $a_5$      0        0       2/7    -13/14       1       0     5/14
    $e$        0        0       2/7      1/14       0       1     5/14


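The optimum strategy is thus $\frac{14}{5}\left(\frac{4}{14}, \frac{1}{14}\right) = \left(\frac{4}{5}, \frac{1}{5}\right)$ for Columnman. The optimal strategy for
Rowman is $\frac{14}{5}\left(\frac{4}{14}, \frac{1}{14}, 0\right) = \left(\frac{4}{5}, \frac{1}{5}, 0\right)$.

The expected payoff if both players use this strategy is thus

    $\frac{4}{5}\left(3 \cdot \frac{4}{5} + 2 \cdot \frac{1}{5}\right) + \frac{1}{5}\left(2 \cdot \frac{4}{5} + 6 \cdot \frac{1}{5}\right) = \frac{14}{5}$.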


Note also that this lies between $\alpha$ and $\beta$.
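The passage from the final tableau to the mixed strategies is a simple rescaling by $1/z$. The sketch below carries it out for this game; we assume, as in the tableaux above, that Columnman's unscaled solution is read from the column b and Rowman's from the objective-row entries under the slack columns, and we take the payoff matrix from the initial tableau.

```python
from fractions import Fraction as F

A = [[3, 2],
     [2, 6],
     [1, 5]]                        # payoff matrix, read off the initial tableau

z = F(5, 14)                        # optimal value of the linear programme
y = [F(2, 7), F(1, 14)]             # basic values of y_1, y_2 in the final tableau
w = [F(2, 7), F(1, 14), F(0)]       # objective-row entries under the slack columns

q = [v / z for v in y]              # Columnman's mixed strategy
p = [v / z for v in w]              # Rowman's mixed strategy

# Expected payoff p^T A q when both play these strategies.
value = sum(p[i] * A[i][j] * q[j] for i in range(3) for j in range(2))
print(q, p, value)
```

The expected payoff equals $1/z = 14/5$, as computed by hand above.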

There is a choice above. The other way looks essentially equivalent: 

> with(linalg):
> A:=matrix(4,7,[[3,2,1,0,0,0,1],[2,6,0,1,0,0,1],[1,5,0,0,1,0,1],[-1,-1
> ,0,0,0,1,0]]);

           [ 3   2  1  0  0  0  1]
           [ 2   6  0  1  0  0  1]
      A := [ 1   5  0  0  1  0  1]
           [-1  -1  0  0  0  1  0]

> E:=mulrow(A,1,1/3):E:=pivot(E,1,1);

           [1    2/3    1/3  0  0  0  1/3]
           [0   14/3   -2/3  1  0  0  1/3]
      E := [0   13/3   -1/3  0  1  0  2/3]
           [0   -1/3    1/3  0  0  1  1/3]

> F:=mulrow(E,2,3/14):F:=pivot(F,2,2);

           [1  0   3/7    -1/7   0  0  2/7 ]
           [0  1  -1/7     3/14  0  0  1/14]
      F := [0  0   2/7   -13/14  1  0  5/14]
           [0  0   2/7     1/14  0  1  5/14]


Solutions for Questions 6 (page 86).

Solutions for Questions 6 (page 86). 

Solution 7.1: Suppose there are constants $\lambda_1, \lambda_2, \ldots, \lambda_k \in \mathbb{R}$ such that $x_0 \in \mathbb{R}^n$ is an unconstrained
optimum (max or min) of

    $f(x) - \sum_{i=1}^{k} \lambda_i c_i(x)$

and that in addition $c_j(x_0) = 0$ for $1 \leq j \leq k$. Then $x_0$ is a solution of the constrained optimization
problem:

    maximise $f(x)$ subject to $c_j(x) = 0$ for $(1 \leq j \leq k)$.

We illustrate this with the given example; let

    $L(x) = L(x, \lambda) = x_1^2 + \cdots + x_n^2 - \lambda(x_1 + \cdots + x_n - 1)$.

We seek critical points; they occur when

    $\frac{\partial L}{\partial x_i} = 2x_i - \lambda = 0$,    $\frac{\partial L}{\partial \lambda} = -(x_1 + \cdots + x_n - 1) = 0$.

There is a solution, when $x_i = \lambda/2$ and so $n\lambda = 2$; thus when $x_1 = \cdots = x_n = 1/n$. This solution is
unique and is clearly a global minimum of $L(x, \lambda)$, and hence of the given constrained optimization
problem.

Solution 7.2: 

a) By Lagrange’s principle, the required constrained maximum will occur at a local extremum
of

    $F(x_1, x_2, x_3, \lambda_1, \lambda_2) = x_1^2 + x_2^2 + x_3^2 - \lambda_1(x_1 + 2x_2 + x_3 - 4) - \lambda_2(17 - x_1 + 2x_2 - 2x_3)$.

Then

    $0 = \frac{\partial F}{\partial x_1} = 2x_1 - \lambda_1 + \lambda_2$,     so  $x_1 = \frac{1}{2}(\lambda_1 - \lambda_2)$,
    $0 = \frac{\partial F}{\partial x_2} = 2x_2 - 2\lambda_1 - 2\lambda_2$,   so  $x_2 = \lambda_1 + \lambda_2$,
    $0 = \frac{\partial F}{\partial x_3} = 2x_3 - \lambda_1 + 2\lambda_2$,    so  $x_3 = \frac{1}{2}(\lambda_1 - 2\lambda_2)$.

Using these values in the constraints (also obtained by taking the remaining partial derivatives), we
have

    $0 = x_1 + 2x_2 + x_3 - 4 = \frac{1}{2}\lambda_1 - \frac{1}{2}\lambda_2 + 2\lambda_2 + 2\lambda_1 + \frac{1}{2}\lambda_1 - \lambda_2 - 4$,

and thus $8 = \lambda_2 + 6\lambda_1$. The second constraint then gives

    $0 = x_1 - 2x_2 + 2x_3 - 17 = \frac{1}{2}(\lambda_1 - \lambda_2) - 2\lambda_2 - 2\lambda_1 + \lambda_1 - 2\lambda_2 - 17$,

and thus $-\lambda_1 - 9\lambda_2 = 34$.

Solving this pair of equations gives $\lambda_1 = 2$ and $\lambda_2 = -4$, and then $x_1 = 3$, $x_2 = -2$ and $x_3 = 5$.
To see this is a minimum, rather than a maximum, note that $x_2$ can be made arbitrarily large; this
simply fixes $x_1$ and $x_3$.
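The final step, solving the pair of linear equations for the multipliers and substituting back, can be checked with exact arithmetic. A small sketch using Cramer's rule; the helper `solve_2x2` is our own.

```python
from fractions import Fraction as F


def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*u + a12*v = b1, a21*u + a22*v = b2 by Cramer's rule."""
    det = F(a11 * a22 - a12 * a21)
    if det == 0:
        raise ValueError("singular system")
    return (b1 * a22 - a12 * b2) / det, (a11 * b2 - b1 * a21) / det


# 6*lam1 + lam2 = 8 and -lam1 - 9*lam2 = 34, from the two constraints.
lam1, lam2 = solve_2x2(6, 1, 8, -1, -9, 34)

# Substitute into the expressions for x_1, x_2, x_3 obtained above.
x = ((lam1 - lam2) / 2, lam1 + lam2, (lam1 - 2 * lam2) / 2)

print(lam1, lam2, x)
print(x[0] + 2 * x[1] + x[2], x[0] - 2 * x[1] + 2 * x[2])   # should be 4 and 17
```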

b) If the objective function $f$ for a differentiable non-linear programming problem is convex
on a domain $D$ and all the constraints are concave on $D$ then every solution which satisfies the
Kuhn-Tucker conditions is necessarily a global minimum.

The Lagrangian is as above; this then gives the location of the critical points. Using these
equations, the Kuhn-Tucker conditions are thus $\lambda_1 \geq 0$, $\lambda_2 \geq 0$ and:

    $2x_1 - \lambda_1 + \lambda_2 = 0$,
    $2x_2 - 2\lambda_1 - 2\lambda_2 = 0$,
    $2x_3 - \lambda_1 + 2\lambda_2 = 0$,
    $x_1 + 2x_2 + x_3 - 4 \geq 0$,
    $17 - x_1 + 2x_2 - 2x_3 \geq 0$,
    $\lambda_1(x_1 + 2x_2 + x_3 - 4) = 0$,
    $\lambda_2(17 - x_1 + 2x_2 - 2x_3) = 0$.

Of course, if $\lambda_1 \neq 0$ and $\lambda_2 \neq 0$ each constraint is tight, and we have the same solution as before.
We may exclude it because that solution gave $\lambda_2 = -4$, which fails the KT conditions.

If $\lambda_1 = 0$ and $\lambda_2 = 0$, the only solution is $x_1 = 0$, $x_2 = 0$ and $x_3 = 0$, which is not feasible since
$x_1 + 2x_2 + x_3 \geq 4$.

If $\lambda_1 = 0$, by KT and the above we must have $\lambda_2 > 0$ and $17 = x_1 - 2x_2 + 2x_3$. Since $x_1 = -\lambda_2/2$,
$x_2 = \lambda_2$ and $x_3 = -\lambda_2$, we see that $x_1 - 2x_2 + 2x_3 < 0$, and again we reach a contradiction. The
remaining possibility is that $\lambda_1 > 0$ and $\lambda_2 = 0$. In this case, $x_1 = \lambda_1/2$, $x_2 = \lambda_1$ and $x_3 = \lambda_1/2$.
Since by KT we must have $x_1 + 2x_2 + x_3 = 4$,

    $\frac{1}{2}\lambda_1 + 2\lambda_1 + \frac{1}{2}\lambda_1 = 4$  so that  $\lambda_1 = \frac{4}{3}$.

The corresponding solution is

    $\left(\frac{2}{3}, \frac{4}{3}, \frac{2}{3}\right)$

and the minimum value attained is

    $\frac{24}{9}$.

Note also that

    $17 - x_1 + 2x_2 - 2x_3 = 17 - \frac{2}{3} + \frac{8}{3} - \frac{4}{3} = \frac{53}{3} > 0$, so

the constraints are indeed satisfied at this point. Finally, observe that since the objective function
is convex, and the constraints are linear, and hence concave, the solution is a global minimum.
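The case analysis above can be double-checked by testing the Kuhn-Tucker conditions directly at a candidate point. The sketch below is a check of this particular problem only, with the second constraint taken as $17 - x_1 + 2x_2 - 2x_3 \geq 0$ (the form appearing in the Lagrangian of part a); the function name `kt_holds` is ours.

```python
from fractions import Fraction as F


def kt_holds(x, lam):
    """Test stationarity, feasibility, sign and complementary slackness."""
    x1, x2, x3 = x
    l1, l2 = lam
    g1 = x1 + 2 * x2 + x3 - 4          # first constraint, required >= 0
    g2 = 17 - x1 + 2 * x2 - 2 * x3     # second constraint, required >= 0
    stationary = (2 * x1 - l1 + l2 == 0
                  and 2 * x2 - 2 * l1 - 2 * l2 == 0
                  and 2 * x3 - l1 + 2 * l2 == 0)
    return (stationary and l1 >= 0 and l2 >= 0
            and g1 >= 0 and g2 >= 0
            and l1 * g1 == 0 and l2 * g2 == 0)


candidate = (F(2, 3), F(4, 3), F(2, 3))
print(kt_holds(candidate, (F(4, 3), F(0))))      # the point found in part b)
print(kt_holds((3, -2, 5), (2, -4)))             # the point of part a): lambda_2 < 0
print(sum(v * v for v in candidate))             # objective value, 24/9 = 8/3
```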


Solution 7.3: We show below the two constraints and two different positions of the objective
function.

Figure A.1: The feasible region is above the parabola and below the circle.

The objective function becomes smaller as the intercept on the $x_2$-axis increases; in other words,
as the line moves to the northwest!

The Lagrangian is

    $L(x_1, x_2, \lambda, \mu) = x_1 - x_2 - \lambda(1 - x_1^2 - x_2^2) - \mu(3x_2 - 2x_1^2)$.

Using these equations, the Kuhn-Tucker conditions are thus $\lambda \geq 0$, $\mu \geq 0$ and

    $1 + 2\lambda x_1 + 4\mu x_1 = 0$,         $-1 + 2\lambda x_2 - 3\mu = 0$,
    $1 - x_1^2 - x_2^2 \geq 0$,                $3x_2 - 2x_1^2 \geq 0$,
    $\lambda(1 - x_1^2 - x_2^2) = 0$,          $\mu(3x_2 - 2x_1^2) = 0$,

and solutions to these equations give the local minimisers (strictly, provided a certain tangent
condition is non-degenerate, as it is in this case).

The first equation shows that $\lambda = \mu = 0$ does not give a solution; thus at least one of the
constraints is tight.

If $\lambda = 0$, the second equation gives $\mu = -1/3$, which contradicts the fact that $\mu \geq 0$.

If $\lambda > 0$ and $\mu > 0$ so both constraints are tight, $x_1$ and $x_2$ satisfy

    $x_1^2 + x_2^2 = 1$;    $2x_1^2 = 3x_2$.

Eliminating $x_1$ gives

    $3x_2 + 2x_2^2 - 2 = 0 = (2x_2 - 1)(x_2 + 2)$

and $x_2 = 1/2$, since the other solution, $x_2 = -2$, does not give a real solution for $x_1$. This gives
$x_1 = \pm\sqrt{3}/2$ and a minimum value for $z$ of $-(\sqrt{3} + 1)/2$.

Finally we consider the case $\mu = 0$, $\lambda > 0$ so $x_1^2 + x_2^2 = 1$. From our diagram we expect this to
give the global minimum. Since $\mu = 0$, the first equation gives $1 + 2\lambda x_1 = 0$, and $x_1 = -1/(2\lambda)$,
while the second equation gives $x_2 = 1/(2\lambda)$. Since $\lambda > 0$, the “circle” constraint, that $x_1^2 + x_2^2 \leq 1$,
is tight (ie an equality) and $2\lambda^2 = 1$. This corresponds to the solution $x_1 = -1/\sqrt{2}$, $x_2 = 1/\sqrt{2}$ and
an objective value of $z = -\sqrt{2}$ which is thus the minimum.
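As a numerical sanity check, the two candidate points can be compared directly. The sketch below assumes the feasible region described in the figure caption (on or above the parabola $3x_2 = 2x_1^2$ and inside the unit circle) and the objective $z = x_1 - x_2$; the small tolerance only absorbs floating-point rounding on the tight constraints.

```python
import math


def z(x1, x2):
    return x1 - x2


def feasible(x1, x2, tol=1e-12):
    """Inside the unit circle and on or above the parabola 3*x2 = 2*x1**2."""
    return x1 ** 2 + x2 ** 2 <= 1 + tol and 3 * x2 - 2 * x1 ** 2 >= -tol


both_tight = (-math.sqrt(3) / 2, 0.5)                 # both constraints tight
circle_only = (-1 / math.sqrt(2), 1 / math.sqrt(2))   # only the circle tight

for point in (both_tight, circle_only):
    print(point, feasible(*point), z(*point))
```

Both points are feasible, and the point on the circle alone gives the smaller objective value, approximately $-1.414$.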


Solutions for Questions 7 (page 104). 

Solution 9.1: This is a brief sketch of the expected answer. 

The aim in a genetic algorithm is to encode possible solutions in strings. For the example given,
it is reasonable to use a binary string, which represents the particular value of $n$ which is the solution
encoded by the string. Associated with each string is a notion of fitness, indicating how good the
string is as a solution of the problem. Again in the example, the fitness function is clear; it is to be
the value of the objective function at the number represented by the string. Strings with a greater
fitness represent better attempts at the required maximisation.

An initial population of strings is chosen at random, and the fitness of each string is calculated.
Pairs of strings are then selected for mating, and offspring collected to form the next generation.
Selection for mating is biased in such a way that fit strings are most likely to be selected. For
simplicity, we assume that each pair of “parent” strings produces two “offstrings” (groan). Mating
continues until (for convenience) the new population is the same size as the old one, when a single
generation is complete.

Mating itself consists of interchanging “genetic material”. One or more cut points are selected,
each parent’s string is cut at these points, and crossover occurs, so that each offstring is built from
a mixture of both parents’ strings. One further process is necessary, that of a random mutation, in
which each bit in a child string is considered for flipping. This is allowed to occur with a very low
probability as a way of introducing useful parts of the string spectrum which have otherwise never
been explored, or have died out.

In operation, the population is allowed to evolve over a large number (perhaps 1000) of
generations, and the fittest individual in the final population is taken as the solution. A variant stores the
best individual encountered during the evolution, perhaps even allowing it to be immortal, at least
until a fitter individual is encountered.

Two additional processes should be mentioned. One of these is the code used initially. Although
binary code is suggested above, a Gray coding is preferable, so that adjacent values of $n$ are coded
with strings which are always very similar, as opposed to the “Hamming cliffs” which occur eg
between 15 and 16 in standard binary encoding. The second process, called “fitness scaling”, is
needed to avoid the population being swamped by a few very fit individuals, to the exclusion
of other genetic material. The fitness of fit individuals is often limited to some maximum value,
perhaps twice that of the average fitness, in order to allow a more varied next generation.
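The remark about Hamming cliffs is easy to see in code. The sketch below uses the standard binary-reflected Gray code; the function names are ours.

```python
def gray(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def hamming(a, b):
    """Number of bit positions in which a and b differ."""
    return bin(a ^ b).count("1")


# Plain binary: 15 = 01111 and 16 = 10000 differ in every one of five bits.
print(hamming(15, 16))
# Gray coded, the same two neighbours differ in a single bit.
print(hamming(gray(15), gray(16)))
```

With this coding every pair of adjacent integers differs in exactly one bit, so a single mutation can always move to a neighbouring value of $n$.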


Solution 9.2: 




a) Given the problem, in which we have to choose a binary state of each of a finite number of
gates, it is clear how we can encode a program as a binary string. We are given a real number $x$,
say, which measures the difference between the output from the program and the desired output;
we take the fitness of the corresponding string to be $1/x$; strings with a greater fitness represent
better attempts at the required program.

An initial population of strings is chosen at random, and the fitness of each string is calculated.
Pairs of strings are then selected for mating, and offspring collected to form the next generation.
Selection for mating is biased in such a way that fit strings are most likely to be selected. For
simplicity, we assume that each pair of “parent” strings produces two “offstrings” (groan). Mating
continues until (for convenience) the new population is the same size as the old one, when a single
generation is complete.

Mating itself consists of interchanging “genetic material”. One or more cut points are selected,
each parent’s string is cut at these points, and crossover occurs, so that each offstring is built from
a mixture of both parents’ strings. One further process is necessary, that of a random mutation, in
which each bit in a child string is considered for flipping. This is allowed to occur with a very low
probability as a way of introducing useful parts of the string spectrum which have otherwise never
been explored, or have died out.

In operation, the population is allowed to evolve over a large number (perhaps 1000) of
generations, and the fittest individual in the final population is taken as the solution. A variant stores
the best individual encountered during the evolution, perhaps even allowing it to be immortal, at
least until a fitter individual is encountered. A process called “fitness scaling” may be needed
to avoid the population being swamped by a few very fit individuals, to the exclusion of other
genetic material. The fitness of fit individuals is often limited to some maximum value, perhaps
twice that of the average fitness, in order to allow a more varied next generation.
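The steps just described fit in a few lines. The following is a minimal sketch only, not the program referred to in the question: it uses a single cut point, caps fitness at twice the average as a crude form of fitness scaling, and keeps an "immortal" best string. The function `evolve`, its parameters, and the stand-in fitness (the number of ones in the string) are all our own assumptions; fitness values are assumed non-negative and strings at least two bits long.

```python
import random


def evolve(fitness, n_bits, pop_size=20, generations=50, p_mutate=0.01, seed=0):
    """Evolve bit strings (lists of 0/1) and return the fittest one ever seen."""
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    best = max(pop, key=fitness)
    for _ in range(generations):
        raw = [fitness(s) for s in pop]
        cap = 2 * sum(raw) / len(raw)                  # fitness scaling
        weights = [min(f, cap) + 1e-9 for f in raw]    # keep weights positive
        offspring = []
        while len(offspring) < pop_size:
            mum, dad = rng.choices(pop, weights=weights, k=2)   # biased selection
            cut = rng.randint(1, n_bits - 1)                    # one cut point
            for child in (mum[:cut] + dad[cut:], dad[:cut] + mum[cut:]):
                # low-probability mutation: consider each bit for flipping
                offspring.append([bit ^ 1 if rng.random() < p_mutate else bit
                                  for bit in child])
        pop = offspring[:pop_size]
        champion = max(pop, key=fitness)
        if fitness(champion) > fitness(best):
            best = champion
    return best


best = evolve(sum, n_bits=16)
print(best, sum(best))
```

For the gate problem the stand-in fitness would be replaced by $1/x$, with $x$ the measured difference between actual and desired output.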

In case the example seems fanciful, I saw it running in Sussex; it was also featured in recent 
EPSRC publicity. 

Solution 9.3: Linear optimisation methods can be applied when the objective function is a 
linear (strictly, affine) function, and the constraints are also linear or affine. An algorithm 
is then available for converting such problems into a common “tableau” form. The simplex method 
can then be applied, either to move from one feasible solution to a better one or, in the two-phase 
case, to move towards a feasible solution in the first phase, from which point the one-phase algorithm 
can be applied. The algorithm can terminate in one of three ways: there may be no feasible solution, 
in which case the two-phase algorithm terminates before it finds one; or an unbounded one-parameter 
family of solutions will be found, each better than the previous one; or termination will occur at 
an optimal solution. 

Typical problems are like the wine blending one above, or others, such as the transport problem, 
in which the linear and feasibility assumptions are natural. 

Non-linear problems have a much wider range of applicability, and can be thought of as optimising 
a (not necessarily linear) objective function subject to constraints which are themselves 
not necessarily linear. One such methodology is a genetic algorithm, in which different solutions 
are coded as (binary) strings, thought of as genes, and for which the objective function, applied to a 
binary string, is considered to measure the fitness of the string. A population of strings is chosen 
at random, and allowed to evolve, by processes including the natural selection of fit strings, mating 
of strings to include exchange of substrings, and a low-probability mutation process which aims to 
improve genetic diversity. The effect of such an algorithm is to explore the whole solution space 
more efficiently than by exhaustive search. 

Typical problems here are very much more diverse, from simple non-linear optimisation to 
generating rules for efficient game playing. 

In general, linear methods will be very much faster than non-linear ones, although with a much 
more restricted problem domain. However, there are specialised non-linear algorithms which are 
themselves very efficient, when they embody much knowledge about the specific problem they are 
trying to solve. More general algorithms, such as genetic algorithms and simulated annealing, have a wide 
applicability; although they are not as good as specific algorithms for a given problem, they can 
often be applied rapidly to new types of problems. 

If a problem is suitable for both linear and non-linear methods, then the problem 
is necessarily a linear one, and the linear methods, being designed for the problem, are 
the most efficient. Some non-linear methods may do well, particularly calculus-based ones, 
which make the assumption that the objective function and constraints are locally linear. 

Solution 9.4: For nearly linear problems, typically those that are convex, some variant of hill 
climbing can be used, moving in the direction of the largest rate of change of the objective function, 
and again having a guarantee of convergence. This could converge quickly but, with an ill-conditioned 
objective function, could be much slower. For a more general problem, there are many less traditional 
methods, such as a genetic algorithm or simulated annealing, in which less structure 
is needed and a good approximation to the extremum is likely. Although exhaustive search is a 
theoretical possibility, it is ruled out on performance grounds; in any case such a method is very 
slow compared with those for which more is known about the objective function. 
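
The effect of ill conditioning on hill climbing can be seen in a small experiment. The Python sketch below is our own illustration, not part of the course material: it maximises the concave function f(x, y) = -(x² + κy²)/2 by steepest ascent with an exact line search, and counts the steps needed. The function name, the tolerance and the starting point (κ, 1), chosen to be awkward for the method, are all assumptions.

```python
def steepest_ascent_steps(kappa, tol=1e-8, max_steps=100_000):
    """Count steepest-ascent steps on f(x, y) = -(x**2 + kappa*y**2)/2."""
    x, y = kappa, 1.0                      # deliberately awkward starting point
    for step in range(max_steps):
        gx, gy = -x, -kappa * y            # gradient of f at (x, y)
        g2 = gx * gx + gy * gy
        if g2 < tol * tol:                 # gradient is negligible: stop
            return step
        # exact line search: the step length maximising f along the gradient
        alpha = g2 / (gx * gx + kappa * gy * gy)
        x, y = x + alpha * gx, y + alpha * gy
    return max_steps

well = steepest_ascent_steps(1.0)      # circular contours
ill = steepest_ascent_steps(100.0)     # long, thin elliptical contours
print(well, ill)
```

With κ = 1 the gradient points straight at the maximum and a single step suffices; with κ = 100 the iterates zigzag across the narrow ridge and many more steps are needed, although convergence is still guaranteed.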

If both linear and non-linear methods are applicable to the same problem, the linear one would 
be superior, being in general both more effective and faster. In fact some non-linear methods can 
reduce to the corresponding linear ones in a linear situation, but the purpose built methods will 
still be much more efficient; for example the simplex algorithm has a very simple accurate stopping 
rule. 

Solution 9.5: 

a) One variant of the Travelling Salesman Problem is: 

A salesman has to visit n cities and return to his city of origin. Each city has to be 
visited exactly once, and the cost of the journey between each pair of cities is known. 

The problem is to do the tour at minimum total cost. 

We show the calculation below. We have chosen A as the starting point and listed all possible 
tours; there is a unique tour with the minimum length of 26, namely ACDB. 


Tour         Step 1   Step 2   Step 3   Step 4    Total

A B C D           4        6       10       11       31
A B D C           4        8        9        7       28
A C B D           6       10        8       11       35
A C D B           6       10        7        3       26
A D B C          12        7        6        7       32
A D C B          12        9       10        3       34
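
The enumeration can be checked mechanically. The Python sketch below lists every tour starting at A and totals its cost. The one-way costs in `COST` are inferred from the table, on the assumption that the four Step columns give the cost of each successive leg, including the return to A; the names `COST`, `tour_cost` and `all_tours` are our own.

```python
from itertools import permutations

# One-way costs inferred from the table; note that they are not symmetric
# (for example A to B costs 4, but B to A costs 3).
COST = {
    ("A", "B"): 4,  ("B", "A"): 3,
    ("A", "C"): 6,  ("C", "A"): 7,
    ("A", "D"): 12, ("D", "A"): 11,
    ("B", "C"): 6,  ("C", "B"): 10,
    ("B", "D"): 8,  ("D", "B"): 7,
    ("C", "D"): 10, ("D", "C"): 9,
}

def tour_cost(tour):
    """Total cost of visiting the cities in order and returning to the start."""
    legs = zip(tour, tour[1:] + tour[:1])
    return sum(COST[leg] for leg in legs)

def all_tours(start="A", others=("B", "C", "D")):
    """Every tour beginning at `start`: (n-1)! of them for n cities."""
    return [(start,) + rest for rest in permutations(others)]

best = min(all_tours(), key=tour_cost)
print("".join(best), tour_cost(best))   # ACDB 26
```

With four cities there are only 3! = 6 tours to examine, but the count grows factorially with the number of cities, which is why exhaustive search is not practical for larger problems.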


b) The two essential ingredients of a genetic algorithm are the availability of a natural 
representation of the problem and its constraints, which leads to suitable genetic operations, and the 
availability of a useful fitness function. In this problem, the fitness function is clear: two tours 
can be compared simply by looking at the total cost of each tour. In contrast, there is no natural 
representation which describes the problem. 

Of course a naive binary representation of each tour is available, in which each city is numbered, 
and a gene is simply an ordered list of city numbers. This representation does not however lend 
itself to useful genetic operations; a naive mating may not even produce a tour (some city numbers 
may be invalid), and is unlikely to produce a valid tour with no repeated cities. 

We are thus led to seek alternative (non-binary) representations which are adapted to this 
particular problem. Here are some possible representations. 

• the ordinal representation, in which the tour (1,2,4,3,8,5,9,6,7) is represented as the list 
(1,1,2,1,4,1,3,1,1), in which each code gives the position of the next city in the list of 
unused cities; 







• the path representation, which was the natural one we used above; and 

• the adjacency representation, in which the tour (1,2,4,3,8,5,9,6,7) in the path representation 
is given as (2,4,8,3,9,7,1,5,6), indicating by the entry in position k that the tour goes 
from k to this new city. 

Even when crossover is defined, it does not lead to natural “genetic evolution”, and so it is necessary 
to have much more specialised crossover operators, in which the naive crossover is “repaired”. 
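
The three representations are easy to convert between. The Python sketch below, with function names of our own choosing, starts from a tour in the path representation and produces the ordinal and adjacency forms; it assumes the cities are labelled 1 to n.

```python
def path_to_ordinal(path):
    """Code each city by its (1-based) position in the list of unused cities."""
    unused = sorted(path)
    code = []
    for city in path:
        position = unused.index(city)
        code.append(position + 1)
        unused.pop(position)
    return code

def ordinal_to_path(code, cities):
    """Invert the ordinal coding, given the full set of city labels."""
    unused = sorted(cities)
    return [unused.pop(position - 1) for position in code]

def path_to_adjacency(path):
    """Entry k (1-based) is the city visited immediately after city k."""
    successor = dict(zip(path, path[1:] + path[:1]))
    return [successor[k] for k in sorted(path)]

tour = [1, 2, 4, 3, 8, 5, 9, 6, 7]
print(path_to_ordinal(tour))     # [1, 1, 2, 1, 4, 1, 3, 1, 1]
print(path_to_adjacency(tour))   # [2, 4, 8, 3, 9, 7, 1, 5, 6]
```

One point in favour of the ordinal form is that its i-th entry need only lie between 1 and n - i + 1, so splicing two valid codes at a common cut point always decodes to a valid tour; whether the offspring inherit anything useful from their parents is another matter.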

c) We simply use the obvious path representation described above; thus each potential tour 
can be described by a re-arrangement (permutation) of the city labels. 

We try to generate “intelligent” changes, and indeed this is at the heart of the method. Here 
are two possibilities: 

• a section of the tour is cut out and replaced by one running the other way round; or 

• the section is replaced by a random re-arrangement of that section. 

The objective function is again the one given above. We choose the “temperature” parameter 
T so that initially it will allow such changes with high probability. At each temperature we run 
the algorithm until, for example, either 

• a total of 100N reconfigurations have been tried; or 

• a total of 10N successful reconfigurations have occurred, 

where N is the number of cities. Next, as part of the cooling schedule, decrease the temperature 
by (say) 10% and repeat the calculations. Continue in this way until there is so little change 
that the system seems to have converged. 
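
A minimal Python sketch of this scheme follows. It uses only the first kind of change (reversing a section of the tour), and keeps a copy of the best tour seen. The function names, the choice of the initial tour cost as starting temperature, the acceptance rule exp(-Δ/T) and the limit of 200 cooling stages are all assumptions made for the illustration; `cost[i][j]` is taken to be the cost of travelling from city i to city j.

```python
import math
import random

def tour_cost(tour, cost):
    """Cost of the closed tour, returning from the last city to the first."""
    n = len(tour)
    return sum(cost[tour[i]][tour[(i + 1) % n]] for i in range(n))

def anneal(cost, rng, cooling=0.9, max_stages=200):
    n = len(cost)
    tour = list(range(n))
    rng.shuffle(tour)
    current = tour_cost(tour, cost)
    best_tour, best = tour[:], current
    # assumption: start hot enough that most changes are accepted
    temperature = float(current) or 1.0
    for _ in range(max_stages):
        tried = accepted = 0
        while tried < 100 * n and accepted < 10 * n:
            tried += 1
            i, j = sorted(rng.sample(range(n), 2))
            # cut out the section i..j and replace it running the other way
            candidate = tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]
            delta = tour_cost(candidate, cost) - current
            if delta <= 0 or rng.random() < math.exp(-delta / temperature):
                tour, current = candidate, current + delta
                accepted += 1
                if current < best:
                    best_tour, best = tour[:], current
        if accepted == 0:          # nothing changed at this temperature
            break
        temperature *= cooling     # cool by 10% and repeat
    return best_tour, best

# The four-city example of part a), with A, B, C, D numbered 0 to 3.
COSTS = [[0, 4, 6, 12], [3, 0, 6, 8], [7, 10, 0, 10], [11, 7, 9, 0]]
print(anneal(COSTS, random.Random(1)))
```

On so small an example the method is of course unnecessary; the point is only to show the shape of the loop: many trial changes at each temperature, with worsening changes accepted less and less often as T falls.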


Bibliography 


Char, B. W., Geddes, K. O., Gonnet, G. H., Leong, B. L., Monagan, M. B. & Watt, S. M. 
(1992), First Leaves: A Tutorial Introduction to Maple V, Springer-Verlag. 

Goldberg, D. E. (1989), Genetic Algorithms in Search, Optimization, and Machine Learning, 
Addison-Wesley. pp412. 

Hartley, R. (1985), Linear and Nonlinear Programming: An Introduction to Linear Methods 
in Mathematical Programming, Ellis Horwood. 

Kaplan, W. (1999), Maxima and Minima with Applications, John Wiley and Sons Inc. 

Kolman, B. & Beck, R. E. (1995), Elementary Linear Programming with Applications, 
second edn, Academic Press. 

Metropolis, N., Rosenbluth, A., Rosenbluth, M., Teller, A. & Teller, E. (1953), ‘Simulated 
annealing’, Journal of Chemical Physics 21, 1087-1092. 

Michalewicz, Z. (1996), Genetic Algorithms + Data Structures = Evolution Programs, third 
edn, Springer. 

Michalewicz, Z. & Fogel, D. B. (2000), How to Solve It; Modern Heuristics, Springer. 

Mitchell, M. (1996), An Introduction to Genetic Algorithms, The MIT Press. 

Ngo, T. & Bhadkamkar, N. (1998), The cocktail party effect, in ‘New Scientist’, number 
2159, Reed Business Information, pp. 51 - 52. 

Press, W. H., Flannery, B. P., Teukolsky, S. A. & Vetterling, W. T. (1992), Numerical 
recipes in C. The Art of Scientific Computing, second edn, Cambridge University 
Press. 

Smythe, W. R. & Johnson, L. A. (1966), Introduction to Linear Programming with Applications, 
Prentice-Hall. 

Winston, W. L. (1995), Introduction to Mathematical Programming: Applications and Algorithms, 
second edn, Duxbury. 




Index Entries 


pivotvar, 33 
ratio, 33 

aims, iii 
algorithm 

genetic, 87 
annealing, 101, 102 
simulated, 101 
annealing schedule, 103 
artificial variable, 45 
artificial variables, 45 
assignment 
maple, 31 
associated, 14 
associated solution, 14 
augmented matrix, 15 

bad, 42 
bad rows, 42 
basic column, 14 
basic feasible solution, 14 
distinct, 22 

basic solution, 14, 19, 20 
degenerate, 14 
basic variable, 14 
basis 

change, 19 
changing, 17 
replacing, 20 
big M method, 45 
black box, 89 
Bland’s Rule, 27 
books, viii 

canonical form, 16 
change of basis, 17 
co-operation, 99 
coding 

Gray, 88 

column space, 13 
competition, 99 
competitive pressure, 92 
concave, 85 

conjugate gradient, 101 
conservative, 66 


constrained optimisation, 1 
constraint 
tight, 2 
constraints, 6 
convergence proof, 24 
convex, 84 
convexity, 84 
course aims, iii 
critical point, 76 
crossover, 90 
current hilltop, 93 
cycling, 26, 36 
example, 27 

decision variables, 2 
degeneracy, 26 
degenerate, 14 

degenerate basic solution, 14 
diversity 

genetic, 91 
dominance, 92 
dual, 57 

dual problem, 57 
dual problems, 63 
duality, 57 
duality theorem, 58 

elementary matrix, 15 
evolutionary step, 90 
exceed, 30 

exhaustive search, 89 
expectation, 68 

facility location problem, 75 
feasible region, 1, 7 
feasible solution, 7, 20 
Fermat-Weber, 75 
first derivative test, 76 
fitness, 89 
fitter, 89 
fittest 

survival, 87 
flow chart 

one phase, 23 
two phase, 42 


fundamental duality theorem, 59 

Gene Pool, 89 
genetic, 87 
genetic algorithm, 87 
good, 42 
good rows, 42 
Gray coding, 88, 92 

Hamming cliff, 87 
hill-climbing, 93 
Hoffman and Kruskal’s Theorem, 96 

integral constraints, 96 

Kuhn-Tucker, 82 

Lagrange’s Principle, 79 
Lagrangian function, 80 
learning outcomes, iii 
left hand variables, 9 
Linear Programming Problem, 1, 6 
list 

maple, 32 
logging on, 29 
lower, 66 

machine shop problem, 2 
Maple, vii, viii, 29 
mating-pool, 90 
matrix game, 65 
Maxima and Minima, 76 
maxmin, 66 
mediocrity, 92 
Mendip Metals, 5, 39 
mimetic, 97 
minimise 

in maple, 35 
minmax, 66 
Mondrians, 99 
Muchals, 5 




Mutation, 91 
mutation, 91 

Natural Selection, 90 
non-feasible solution, 7 
non-negativity, 7 
none, 50 

non-negativity restrictions, 7 
NP-complete, 94 

objective function, 1, 7 
objective functions 
multiple, 100 
objective variable, 7, 19 
operands 
list, 32 

optimal feasible vector, 1 
optimal solution, 7 
optimal value, 7 
optimum lower, 68 
optimum upper, 68 
optimum value, 1 
other constraints, 52 
outcomes, iii 

payoff, 65 
pill, 63 
pivot row, 17 
pivoting, 17, 19 
in maple, 35 
primal, 57 
primal problem, 57 
printing, 31 
Prisoners Dilemma, 99 
problems 
dual, 63 


process model, 89 
proper sign for improvement, 21 

ratio theorem, 20 
reality constraints, 7 
related standard form, 7 
replace, 17 
replace in a basis, 17 
restricted normal form, 9 
right hand variables, 9 
roulette wheel breeding, 91 
row operations, 15 
row-equivalent, 14, 25 

saddle point, 67 
salesman 

wicked, 63 

Second Derivative Test, 77 
second phase, 41 
selection strategies, 91 
set 

maple, 32 
shadow price, 63 
simplex algorithm 
one-phase, 22 
two phase, 42 

simulated annealing, 94, 101 
sites, 90 

slack variables, 8 
solution 

basic, 14 
associated, 14 
basic feasible, 14 
standard form, 7 
steepest descent, 101 


strictly determined, 67 
student 

poor, 62 
subtour, 95 
sysa, 29 

tableau, 16 
teacher 

redundant, v 
termination, 24, 26, 27 
theorem 

fundamental duality, 59 
integer values, 96 
tight, 2 
tour, 94 

transport problem, 3 
transport problems, 3 
Travelling Salesman Problem, 94 
tutorials, viii 

two-dimensional problem, 1 
two-phase tableau, 41 

unassigning, 32 
unrestricted problem, 57 
upper, 66 
using maple, 29 

value, 67, 68 
optimum, 1 
variables 

artificial, 45 
vitamin, 62 

X Windows, 29 
xterm, 30 


